10. An English language translation of the annexes to the International Preliminary
Examination Report under PCT Article 36 (35 U.S.C. 371(c)(5)).
CALCULATIONS

BASIC NATIONAL FEE (37 CFR 1.492(a)(1)-(5)):

Neither international preliminary examination fee (37 CFR 1.482) nor
international search fee (37 CFR 1.445(a)(2)) paid to USPTO and
International Search Report not prepared by the EPO or JPO    $1,000

International preliminary examination fee (37 CFR 1.482) not paid to
USPTO but International Search Report prepared by the EPO or JPO    $860

International preliminary examination fee (37 CFR 1.482) not paid to
USPTO but international search fee (37 CFR 1.445(a)(2)) paid to
USPTO    $710

International preliminary examination fee paid to USPTO (37 CFR 1.482)
but all claims did not satisfy provisions of PCT Article 33(1)-(4)    $690

International preliminary examination fee paid to USPTO (37 CFR 1.482)
and all claims satisfied provisions of PCT Article 33(1)-(4)    $100

ENTER APPROPRIATE BASIC FEE AMOUNT =    $690

Surcharge of $130 for furnishing the oath or declaration later than 20 30
months from the earliest claimed priority date (37 CFR 1.492(e)).    $
                                   NUMBER FILED   NUMBER EXTRA   RATE
Total claims                       25 - 20 =      5              x $18     $90
Independent claims                 6 - 3 =        3              x $80     $240
MULTIPLE DEPENDENT CLAIM(S) (if applicable)                      + $270    $270

TOTAL OF ABOVE CALCULATIONS =    $1290




Applicant claims small entity status. See 37 CFR 1.27. The fees
indicated above are reduced by 1/2.    $

SUBTOTAL =    $1290

Processing fee of $130 for furnishing the English translation later than 20 30
months from the earliest claimed priority date (37 CFR 1.492(f)) +    $0

TOTAL NATIONAL FEE =    $1290

Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must
be accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40 per
property +    $

TOTAL FEES ENCLOSED =    $1290

Amount to be refunded: $
Amount to be charged: $

X 17a. A check in the amount of $1290.00 to cover the above fees is enclosed. Check
No. 121756

X 17c. The Commissioner is hereby authorized to charge any additional fees which may be
required, or credit any overpayment to Deposit Account No. 03-1740. A duplicate
copy of this sheet is enclosed.
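
The arithmetic behind the fee total on this sheet can be checked with a short sketch. The rates below ($18 per claim in excess of 20, $80 per independent claim in excess of 3, $270 for multiple dependent claims) are the ones printed on the form; the function name and its parameters are hypothetical and are not part of any USPTO system.

```python
# Rates as printed on the transmittal form above (not current USPTO rates).
RATE_PER_EXTRA_CLAIM = 18
RATE_PER_EXTRA_INDEPENDENT = 80
MULTIPLE_DEPENDENT_FEE = 270


def total_national_fee(basic_fee, total_claims, independent_claims,
                       multiple_dependent=False, late_translation_fee=0):
    """Recompute the TOTAL NATIONAL FEE line from the entries on the form."""
    extra_claims = max(total_claims - 20, 0)            # claims in excess of 20
    extra_independent = max(independent_claims - 3, 0)  # independent claims in excess of 3
    total = (basic_fee
             + extra_claims * RATE_PER_EXTRA_CLAIM
             + extra_independent * RATE_PER_EXTRA_INDEPENDENT)
    if multiple_dependent:
        total += MULTIPLE_DEPENDENT_FEE
    return total + late_translation_fee


# Entries from this form: $690 basic fee, 25 total claims, 6 independent claims,
# multiple dependent claims present, no late-translation processing fee.
print(total_national_fee(690, 25, 6, multiple_dependent=True))  # 1290
```

With the entries on this form the sketch gives 690 + 90 + 240 + 270 = 1290, which matches the check amount stated in item 17a.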

SEND ALL CORRESPONDENCE TO: 

Barry F. McGurl 
CHRISTENSEN O'CONNOR JOHNSON KINDNESS PLLC
1420 Fifth Avenue
Suite 2800
Seattle, WA 98101



Respectfully submitted, 

CHRISTENSEN O'CONNOR
JOHNSON KINDNESS PLLC

Barry F. McGurl
Registration No. 43,340
Direct Dial (206) 695-1775 

EXPRESS MAIL CERTIFICATE 

"Express Mail" mailing label number: EL599431886US 
Date of Deposit October 23, 2000 

I hereby certify that this paper or fee is being deposited with the United States Postal Service "Express
Mail Post Office to Addressee" service under 37 C.F.R. § 1.10 on the date indicated above and is addressed to
the Assistant Commissioner for Patents, Washington, D.C. 20231.

(Typed or printed name of person mailing paper or fee)
(Signature of person mailing paper or fee)
BOX PCT

'22 Rec'd PCT/PTO 23 OCT 2000

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicants: Zhi-Qiang XIA, Michael A. COSTA, Laurence B. DAVIN, and Norman G. LEWIS
Attorney Docket No. WSUR1 16430

Int'l Application No: PCT/US99/08975
Int'l Filing Date: 23 April 1999

U.S. Application No:
Priority Date Claimed: 24 April 1998

Filed: Concurrently Herewith
Examiner:

Title: RECOMBINANT SECOISOLARICIRESINOL DEHYDROGENASE, AND METHODS OF USE

PRELIMINARY AMENDMENT 
TO THE COMMISSIONER FOR PATENTS: 

Please enter the following amendments to the specification and claims of the
above-identified patent application, which is the contemporaneously filed United States national
application corresponding to International Application No. PCT/US99/08975:

In the Specification:

Amend the specification by inserting the following after the title: --This is a United
States national stage application of International Application No. PCT/US99/08975, filed
April 23, 1999, the benefit of the filing date of which is hereby claimed under 35 U.S.C. § 120,
which in turn claims the benefit of U.S. Provisional Application No. 60/082,977, filed
April 24, 1998, the benefit of the filing date of which is hereby claimed under 35 U.S.C. § 119.--.

In the Claims:

1. (Once Amended) An isolated nucleic acid molecule encoding a
secoisolariciresinol dehydrogenase protein that hybridizes to a nucleic acid molecule selected
from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and
SEQ ID NO:9 or to the antisense complement of any member of the group consisting of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9 under conditions of 4 X
SSC at 35°C.

10. (Once Amended) An isolated nucleic acid molecule that hybridizes [under
stringent conditions] to a fragment of any one of the nucleic acid molecules set forth in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9, or to the antisense
complement of any member of the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7 and SEQ ID NO:9 under conditions of 4 X SSC at 35°C, said fragment
having a length of at least 15 bases.

18. (Once Amended) A replicable expression vector comprising a nucleic acid
sequence encoding secoisolariciresinol dehydrogenase that hybridizes to a sequence selected
from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and
SEQ ID NO:9, or to the antisense complement of any member of the group consisting of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9 under conditions of 4 X
SSC at 35°C.

22. (Once Amended) A method of enhancing the expression of secoisolariciresinol
dehydrogenase protein in a suitable host cell comprising introducing into the host cell an
expression vector that comprises a nucleotide sequence [encoding a protein having the biological
activity of a secoisolariciresinol dehydrogenase protein having the amino acid sequence set forth
in any one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8 and SEQ ID NO:10]
that hybridizes to the antisense complement of any member of the group consisting of SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9 under conditions of 4 X
SSC at 35°C.

23. (Once Amended) A method of modifying the expression of secoisolariciresinol
dehydrogenase protein in a suitable host cell comprising introducing into the host cell an
expression vector that comprises a nucleotide sequence that expresses an RNA that hybridizes
under [stringent conditions] conditions of 4 X SSC at 35°C to all or part of [the] a nucleic acid
molecule having [the] a nucleic acid sequence [set forth in] selected from the group consisting of
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9, or to the
antisense complement of any one of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7
and SEQ ID NO:9.



REMARKS

Claims 1, 10, 18, 22 and 23 have been amended to recite specific hybridization language
and, additionally, Claims 10 and 23 have been amended to delete the phrase "stringent
conditions."

Support for the foregoing claim amendments is found in the specification at least at
pages 36-37. No new matter has been added.

If there are any questions, the Examiner is invited to telephone applicant's attorney at the
number listed below.



Respectfully submitted, 



CHRISTENSEN O'CONNOR

Barry F. McGurl
Registration No. 43,340 
Direct Dial No. 206.695.1775 
RECOMBINANT SECOISOLARICIRESINOL DEHYDROGENASE, AND METHODS OF USE

Field of the Invention 
The present invention relates to isolated secoisolariciresinol dehydrogenase
proteins, to nucleic acid sequences which code for secoisolariciresinol dehydrogenase
proteins, and to vectors containing the sequences, host cells containing the sequences
and methods of producing recombinant secoisolariciresinol dehydrogenase proteins
and their mutants.

Background of the Invention 

Lignans are a large, structurally diverse class of vascular plant metabolites
having a wide range of physiological functions and pharmacologically important
properties (Ayres, D.C., and Loike, J.D. in Chemistry and Pharmacology of Natural
Products. Lignans. Chemical, Biological and Clinical Properties, Cambridge
University Press, Cambridge, England (1990); Lewis et al., in Chemistry of the
Amazon, Biodiversity Natural Products, and Environmental Issues, 588, (P.R. Seidl,
O.R. Gottlieb and M.A.C. Kaplan) 135-167, ACS Symposium Series, Washington
D.C. (1995)). Because of their pronounced antibiotic properties (Markkanen, T.
et al., Drugs Exptl. Clin. Res. 7:711-718 (1981)), antioxidant properties (Faure, M.
et al., Phytochemistry 29:3773-3775 (1990); Osawa, T. et al., Agric. Biol. Chem.
49:3351-3352 (1985)) and antifeedant properties (Harmatha, J., and Nawrot, J.,
Biochem. Syst. Ecol. 12:95-98 (1984)), a major role of lignans in vascular plants is to
help confer resistance against various opportunistic biological pathogens and
predators. Lignans have also been proposed as cytokinins (Binns, A.N. et al., Proc.
Natl. Acad. Sci. USA 84:980-984 (1987)) and as intermediates in lignification
(Rahman, M.M.A. et al., Phytochemistry 29:1861-1866 (1990)), suggesting a critical
role in plant growth and development. It is widely held that elaboration of
biochemical pathways to lignins/lignans and related substances from phenylalanine
(tyrosine) was essential for the successful transition of aquatic plants to their vascular
dry-land counterparts (Lewis, N.G., and Davin, L.B., in Isoprenoids and Other
Natural Products. Evolution and Function, 562 (W.D. Nes, ed.) 202-246, ACS
Symposium Series: Washington, DC (1994)), some four hundred and eighty million
years ago (Graham, L.E., Origin of Land Plants, John Wiley & Sons, Inc., New
York, NY (1993)).

Based on existing chemotaxonomic data, lignans are present in "primitive"
plants, such as the fern Blechnum orientale (Wada, H. et al., Chem. Pharm. Bull.
40:2099-2101 (1992)) and the hornworts, e.g., Dendroceros japonicus and
Megaceros flagellaris (Takeda, R. et al., in Bryophytes. Their Chemistry and
Chemical Taxonomy, Vol. 29 (Zinsmeister, H.D. and Mues, R. eds.) pp. 201-207,
Oxford University Press: New York, NY (1990); Takeda, R. et al., Tetrahedron Lett.
31:4159-4162 (1990)), with the latter recently being classified as originating in the
Silurian period (Graham, L.E., J. Plant Res. 109:241-252 (1996)). Interestingly,
evolution of both gymnosperms and angiosperms was accompanied by major changes
in the structural complexity and oxidative modifications of the lignans (Lewis, N.G.,
and Davin, L.B., in Isoprenoids and Other Natural Products. Evolution and
Function, 562 (W.D. Nes, ed.) 202-246, ACS Symposium Series: Washington, DC
(1994); Gottlieb, O.R., and Yoshida, M., in Natural Products of Woody Plants.
Chemicals Extraneous to the Lignocellulosic Cell Wall (Rowe, J.W. and Kirk, C.H.
eds.) pp. 439-511, Springer Verlag: Berlin (1989)). Indeed, in some species, such as
Western Red Cedar (Thuja plicata), lignans can contribute extensively to heartwood
formation/generation by enhancing the resulting heartwood color, quality, fragrance
and durability.

In addition to their functions in plants, lignans also have important
pharmacological roles. For example, podophyllotoxin, as its etoposide and teniposide
derivatives, is an example of a plant compound that has been successfully employed as
an anticancer agent (Ayres, D.C., and Loike, J.D. in Chemistry and Pharmacology of
Natural Products. Lignans. Chemical, Biological and Clinical Properties, Cambridge
University Press, Cambridge, England (1990)). Antiviral properties have also been
reported for selected lignans. For example, (-)-arctigenin (Schroder, H.C. et al., Z.
Naturforsch. 45c, 1215-1221 (1990)), (-)-trachelogenin (Schroder, H.C. et al., Z.
Naturforsch. 45c, 1215-1221 (1990)) and nordihydroguaiaretic acid (Gnabre, J.N.
et al., Proc. Natl. Acad. Sci. USA 92:11239-11243 (1995)) are each effective against
HIV due to their pronounced reverse transcriptase inhibitory activities. Some lignans,
e.g., matairesinol (Nikaido, T. et al., Chem. Pharm. Bull. 29:3586-3592 (1981)),
inhibit cAMP-phosphodiesterase, whereas others enhance cardiovascular activity, e.g.,
syringaresinol β-D-glucoside (Nishibe, S. et al., Chem. Pharm. Bull. 38:1763-1765
(1990)). There is also a high correlation between the presence, in the diet, of the
"mammalian" lignans or "phytoestrogens", enterolactone and enterodiol, formed
following digestion of high fiber diets, and reduced incidence rates of breast and
prostate cancers (so-called chemoprevention) (Axelson, M., and Setchell, K.D.R.,
FEBS Lett. 123:337-342 (1981); Adlercreutz et al., J. Steroid Biochem. Molec. Biol.
41:3-8 (1992); Adlercreutz et al., J. Steroid Biochem. Molec. Biol. 52:97-103
(1995)). The "mammalian lignans," in turn, are considered to be derived from lignans
such as matairesinol and secoisolariciresinol (Boriello et al., J. Applied Bacteriol.,
58:37-43 (1985)).

The biosynthetic pathways to the lignans are only now being defined. Based
on radiolabeling experiments with crude enzyme extracts from Forsythia intermedia,
it was first established that entry into the 8,8'-linked lignans, which represent the most
prevalent dilignol linkage known (Davin, L.B., and Lewis, N.G., in Rec. Adv.
Phytochemistry, Vol. 26 (Stafford, H.A., and Ibrahim, R.K., eds.), pp. 325-375,
Plenum Press, New York, NY (1992)), occurs via stereoselective coupling of two
achiral coniferyl alcohol molecules, in the form of oxygenated free radicals, to afford
the furofuran lignan (+)-pinoresinol (Davin, L.B., Bedgar, D.L., Katayama, T., and
Lewis, N.G., Phytochemistry 31:3869-3874 (1992); Pare, P.W. et al., Tetrahedron
Lett. 35:4731-4734 (1994)).

Recently, the initial step in the 8-8' linked lignan biosynthetic pathway was
clarified in F. intermedia (Davin, L.B., Wang, H.-B., Crowell, A.L., Bedgar, D.L.,
Martin, D.M., Sarkanen, S., Lewis, N.G., Science 275:362-366 (1997)). This
involved stereoselective monolignol coupling of two molecules of coniferyl alcohol in
the presence of a 78 kDa dirigent protein and a one-electron oxidase (such as
laccase). The one-electron oxidant is considered only to provide oxidative capacity,
with the dirigent protein binding, orientating, and coupling the free-radical forms and
releasing (+)-pinoresinol. The dirigent protein was purified from F. intermedia stem
tissue and its encoding gene cloned (Gang, D.R., Costa, M.A., Fujita, M.,
Dinkova-Kostova, A.T., Wang, H.B., Burlat, V., Martin, W., Sarkanen, S., Davin, L.B., Lewis,
N.G., Chemistry & Biology 6:143-151 (1999)).

In Forsythia intermedia, and presumably other species, (+)-pinoresinol
undergoes sequential reduction to generate (+)-lariciresinol and then
(-)-secoisolariciresinol (Katayama, T. et al., Phytochemistry 32:581-591 (1993);
Chu, A. et al., J. Biol. Chem. 268:27026-27033 (1993)). The reductions catalyzed by
pinoresinol/lariciresinol reductase proceed via abstraction of the pro-R hydride of
NADPH, resulting in an "inversion" of configuration at both the C-7 and C-7'
positions of the products, (+)-lariciresinol and (-)-secoisolariciresinol (Chu, A., et al.,
J. Biol. Chem. 268:27026-27033 (1993)). Pinoresinol/lariciresinol reductase was
purified ~3200-fold to apparent electrophoretic homogeneity from a soluble crude
protein extract; this was achieved by employing a series of affinity, hydrophobic
interaction, hydroxyapatite, gel filtration, and ion exchange chromatographic steps
(Dinkova-Kostova, A.T., Gang, D.R., Davin, L.B., Bedgar, D.L., Chu, A., Lewis,
N.G., J. Biol. Chem. 271:29473-29482 (1996)). The purified protein was
demonstrated to be a type A NADPH-dependent reductase.

The corresponding pinoresinol/lariciresinol reductase gene (called plr-Fil) was
cloned from a Forsythia cDNA library (Dinkova-Kostova, A.T., Gang, D.R., Davin,
L.B., Bedgar, D.L., Chu, A., Lewis, N.G., J. Biol. Chem. 271:29473-29482 (1996)),
and its fully functional recombinant protein was then over-expressed in E. coli using a
pET-based expression system (pSBETa vector) (Schenk, P.M., Baumann, S., Mattes,
R., Steinbiß, H.-H., BioTechniques 19:196-200 (1995)). It was found that the only
products formed following incubation of the recombinant pinoresinol/lariciresinol
reductase with (±)-pinoresinols in the presence of NADPH were (+)-lariciresinol and
(-)-secoisolariciresinol, i.e., only (+)-pinoresinol and (+)-lariciresinol, and not
(-)-pinoresinol nor (-)-lariciresinol, served as substrates. Thus, the recombinant enzyme
catalyzed exactly the same enantiospecific conversion as the native plant protein
from Forsythia (Dinkova-Kostova, A.T., Gang, D.R., Davin, L.B., Bedgar, D.L.,
Chu, A., Lewis, N.G., J. Biol. Chem. 271:29473-29482 (1996); Lewis, N.G., Davin,
L.B., in: Comprehensive Natural Products Chemistry, Vol. 1. (Barton, Sir D.H.R.,
Nakanishi, K., and Meth-Cohn, O., eds.), pp. 639-712, Elsevier, London (1999)).
(-)-Matairesinol is subsequently formed via dehydrogenation of
(-)-secoisolariciresinol, further metabolism of which presumably affords lignans such
as the antiviral (-)-trachelogenin in Ipomoea cairica and (-)-podophyllotoxin in
Podophyllum peltatum.

Thus, the stereospecific formation of (+)-pinoresinol and the subsequent
reductive steps giving (+)-lariciresinol and (-)-secoisolariciresinol are pivotal points in
lignan metabolism, since they represent entry into the furano, dibenzylbutane,
dibenzylbutyrolactone and aryltetrahydronaphthalene lignan subclasses. Additionally,
it should be noted that while lignans are normally optically active, the particular
enantiomer present may differ between plant species. For example, (-)-pinoresinol
occurs in Xanthoxylum ailanthoides (Ishii et al., Yakugaku Zasshi, 103:279-292
(1983)), and (-)-lariciresinol is present in Daphne tangutica (Lin-Gen, et al., Planta
Medica, 45:172-176 (1982)). The optical activity of a particular lignan may have
important ramifications regarding biological activity. For example, (-)-trachelogenin
inhibits the in vitro replication of HIV-1, whereas its (+)-enantiomer is much less
effective (Schroder et al., Z. Naturforsch. 45c:1215-1221 (1990)).

The lignan matairesinol is an important component of the plant arsenal that
helps confer dietary benefits to humans, specifically against the onset of breast and
prostate cancers (Adlercreutz, H. and Mazur, W., Ann. Med., 1997, 29:95-120). This
lignan is found in various whole-grain cereal foods, seeds and berries, and is converted
by intestinal bacteria to form enterolactone; the latter compound is considered to be
the primary metabolite in conferring the health protection. Additionally,
matairesinol has an important function in conferring quality, color and durability
to specific heartwoods, such as the highly valued western red cedar (Thuja plicata)
species, via its conversion into plicatic acid and its congeners. Using Forsythia
intermedia as a model system, it was established that matairesinol is formed in planta
via dehydrogenation of secoisolariciresinol (Figure 1) (Umezawa, T., Davin, L.B. and
Lewis, N.G., Biochem. Biophys. Res. Commun., 1990, 171(3), 1008-1014; Umezawa,
T., Davin, L.B., Kingston, D.G.I., Yamamoto, E. and Lewis, N.G., J. Chem. Soc.,
Chem. Commun., 1990, 1405-1408; Umezawa, T., Davin, L.B. and Lewis, N.G.,
J. Biol. Chem., 1991, 266:10210-10217).

Summary of the Invention 
In accordance with the foregoing, a secoisolariciresinol dehydrogenase protein
has been purified from Forsythia intermedia. Thus, one aspect of the invention
relates to isolated recombinant secoisolariciresinol dehydrogenase proteins, such as,
for example, that from Forsythia intermedia. Presently preferred isolated,
recombinant secoisolariciresinol dehydrogenase proteins of the present invention
correspond to secoisolariciresinol dehydrogenase proteins that occur naturally in an
angiosperm or gymnosperm plant species; have a molecular weight of from about
27 kDa to about 31 kDa, more preferably about 29 kDa; have an isoelectric point of from
about 5.9 to about 6.85; and require NAD or NADP as a cofactor.

In other aspects of the invention, cDNAs encoding secoisolariciresinol
dehydrogenase from Forsythia intermedia have been isolated and sequenced, and the
corresponding amino acid sequences have been deduced. Accordingly, the present
invention relates to isolated DNA sequences which code for the expression of
secoisolariciresinol dehydrogenase, such as the sequences designated SEQ ID NO:1,
SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9, which encode
secoisolariciresinol dehydrogenase proteins designated SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8 and SEQ ID NO:10, respectively, from Forsythia
intermedia. Presently preferred DNA sequences encoding secoisolariciresinol
dehydrogenase are isolated from a gymnosperm or angiosperm plant species.

In another aspect, the present invention is directed to isolated nucleic acid
molecules that hybridize under stringent hybridization conditions to a fragment
(having a length of at least 15 bases) of any one of the nucleic acid molecules having
the nucleic acid sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5,
SEQ ID NO:7 and SEQ ID NO:9.

Thus, the present invention relates to isolated proteins and to isolated DNA
sequences which code for the expression of secoisolariciresinol dehydrogenase. In
other aspects, the present invention is directed to replicable recombinant cloning
vehicles comprising a nucleic acid sequence which codes for a secoisolariciresinol
dehydrogenase protein. The present invention is also directed to a base sequence
sufficiently complementary to at least a portion of a secoisolariciresinol
dehydrogenase DNA or RNA to enable hybridization therewith. The aforesaid
complementary base sequences include, but are not limited to: antisense
secoisolariciresinol dehydrogenase RNA; and fragments of DNA that are complementary
to a secoisolariciresinol dehydrogenase DNA, and which are therefore useful as
polymerase chain reaction primers, or as probes for secoisolariciresinol
dehydrogenase genes, or related genes.

In yet other aspects of the invention, modified host cells are provided that
have been transformed, transfected, infected and/or injected with a recombinant
cloning vehicle and/or DNA sequence of the invention. Thus, the present invention
provides for the recombinant expression of secoisolariciresinol dehydrogenase in
plants, animals, microbes and in cell cultures. The inventive concepts described herein
may be used, for example, to facilitate the production, isolation and purification of
significant quantities of recombinant secoisolariciresinol dehydrogenase, or of its
enzyme products, in plants, animals, microbes or cell cultures.

Brief Description of the Drawings 
The foregoing aspects and many of the attendant advantages of this invention
will become more readily appreciated as the same becomes better understood by
reference to the following detailed description, when taken in conjunction with the
accompanying drawings, wherein:

FIGURE 1 shows the enzymatic conversion of (-)-secoisolariciresinol
(structure on the left) to (-)-matairesinol (structure on the right) via (-)-lactol
(structure in the middle).

FIGURE 2(A) shows chiral HPLC separation of a standard laboratory mixture 
of (-)-secoisolariciresinol and (+)-secoisolariciresinol. 

FIGURE 2(B) shows the mode of action of recombinant secoisolariciresinol
dehydrogenase. (±)-[9,9-³H]secoisolariciresinols were incubated with
secoisolariciresinol dehydrogenase. The resulting matairesinol product was
chemically reduced and subjected to HPLC chiral column [Chiralcel OD, Daicel]
analysis. This analysis revealed that only the (-)-antipode was present, as evidenced by
chiral column analysis of the chemically reduced product, (-)-secoisolariciresinol.
Therefore the enzymatic reduction was entirely enantiospecific.

Detailed Description of the Preferred Embodiment

As used herein, the terms "amino acid" and "amino acids" refer to all naturally
occurring L-α-amino acids or their residues. The amino acids are identified by either
the single-letter or three-letter designations:



Asp   D   aspartic acid      Ile   I   isoleucine
Thr   T   threonine          Leu   L   leucine
Ser   S   serine             Tyr   Y   tyrosine
Glu   E   glutamic acid      Phe   F   phenylalanine
Pro   P   proline            His   H   histidine
Gly   G   glycine            Lys   K   lysine
Ala   A   alanine            Arg   R   arginine
Cys   C   cysteine           Trp   W   tryptophan
Val   V   valine             Gln   Q   glutamine
Met   M   methionine         Asn   N   asparagine
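
For readers who handle sequence listings programmatically, the designations above can be held in a simple lookup table. The following is a minimal sketch; the names THREE_TO_ONE and to_one_letter are illustrative only and do not appear in the specification.

```python
# Three-letter to single-letter designations, as listed in the table above.
THREE_TO_ONE = {
    "Asp": "D", "Ile": "I", "Thr": "T", "Leu": "L", "Ser": "S",
    "Tyr": "Y", "Glu": "E", "Phe": "F", "Pro": "P", "His": "H",
    "Gly": "G", "Lys": "K", "Ala": "A", "Arg": "R", "Cys": "C",
    "Trp": "W", "Val": "V", "Gln": "Q", "Met": "M", "Asn": "N",
}


def to_one_letter(residues):
    """Convert a list of three-letter designations to a single-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in residues)


# Hypothetical tripeptide used only to exercise the lookup.
print(to_one_letter(["Met", "Ala", "Gly"]))  # MAG
```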

As used herein, the term "nucleotide" means a monomeric unit of DNA or
RNA containing a sugar moiety (pentose), a phosphate and a nitrogenous heterocyclic
base. The base is linked to the sugar moiety via the glycosidic carbon (1' carbon of
pentose) and that combination of base and sugar is called a nucleoside. The base
characterizes the nucleotide with the four bases of DNA being adenine ("A"), guanine
("G"), cytosine ("C") and thymine ("T"). Inosine ("I") is a synthetic base that can be
used to substitute for any of the four, naturally-occurring bases (A, C, G or T). The
four RNA bases are A, G, C and uracil ("U"). The nucleotide sequences described
herein comprise a linear array of nucleotides connected by phosphodiester bonds
between the 3' and 5' carbons of adjacent pentoses.

"Oligonucleotide" refers to short length single or double stranded sequences of 
deoxyribonucleotides linked via phosphodiester bonds. The oligonucleotides are 
chemically synthesized by known methods and purified, for example, on 
polyacrylamide gels. 

The terms "alteration", "amino acid sequence alteration", "variant" and "amino
acid sequence variant" refer to secoisolariciresinol dehydrogenase molecules with
some differences in their amino acid sequences as compared to the corresponding
native secoisolariciresinol dehydrogenase. Ordinarily, the variants will possess at least
about 70% homology with the corresponding, native secoisolariciresinol
dehydrogenase, and preferably they will be at least about 80% homologous with the
corresponding, native secoisolariciresinol dehydrogenase. The amino acid sequence
variants of secoisolariciresinol dehydrogenase falling within this invention possess
substitutions, deletions, and/or insertions at certain positions. Sequence variants of
secoisolariciresinol dehydrogenase may be used to attain desired enhanced or reduced
enzymatic activity, modified regiochemistry or stereochemistry, or altered substrate
utilization or product distribution.
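
The specification does not state how the 70% and 80% homology figures are to be measured. As one illustration only, the sketch below assumes that homology is approximated by percent identity over two sequences that have already been aligned, with "-" marking gaps; the function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the two strings come from a prior pairwise alignment and therefore
    have equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical ten-residue "native" and "variant" sequences differing at one position.
native = "MKTAYIAKQR"
variant = "MKTAHIAKQR"
print(percent_identity(native, variant))  # 90.0
```

Under this assumed measure, the hypothetical variant would fall above both the 70% and the 80% thresholds mentioned in the text.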

Substitutional secoisolariciresinol dehydrogenase variants are those that have
at least one amino acid residue in the corresponding native secoisolariciresinol
dehydrogenase sequence removed and a different amino acid inserted in its place at
the same position. The substitutions may be single, where only one amino acid in the
molecule has been substituted, or they may be multiple, where two or more amino
acids have been substituted in the same molecule. Substantial changes in the activity
of the secoisolariciresinol dehydrogenase molecule may be obtained by substituting an
amino acid with a side chain that is significantly different in charge and/or structure
from that of the native amino acid. This type of substitution would be expected to
affect the structure of the polypeptide backbone and/or the charge or hydrophobicity
of the molecule in the area of the substitution.

Moderate changes in the activity of the secoisolariciresinol dehydrogenase
molecule would be expected by substituting an amino acid with a side chain that is
similar in charge and/or structure to that of the native molecule. This type of
substitution, referred to as a conservative substitution, would not be expected to
substantially alter either the structure of the polypeptide backbone or the charge or
hydrophobicity of the molecule in the area of the substitution.

Insertional secoisolariciresinol dehydrogenase variants are those with one or
more amino acids inserted immediately adjacent to an amino acid at a particular
position in the native secoisolariciresinol dehydrogenase molecule. Immediately
adjacent to an amino acid means connected to either the α-carboxy or α-amino
functional group of the amino acid. The insertion may be one or more amino acids.
Ordinarily, the insertion will consist of one or two conservative amino acids. Amino
acids similar in charge and/or structure to the amino acids adjacent to the site of
insertion are defined as conservative. Alternatively, this invention includes insertion
of an amino acid with a charge and/or structure that is substantially different from the
amino acids adjacent to the site of insertion.

Deletional variants are those where one or more amino acids in the native
secoisolariciresinol dehydrogenase molecule have been removed. Ordinarily,
deletional variants will have one or two amino acids deleted in a particular region of
the secoisolariciresinol dehydrogenase molecule.

Amino acid sequence variants of secoisolariciresinol dehydrogenase may have
desirable altered biological activity including, for example, altered reaction kinetics,
substrate utilization, product distribution or other characteristics such as
regiochemistry and stereochemistry.

The term "antisense" or "antisense RNA" or "antisense nucleic acid" is used
herein to mean a nucleic acid molecule that is complementary to all or part of a
messenger RNA molecule. Antisense nucleic acid molecules are typically used to
inhibit the expression, in vivo, of complementary, expressed messenger RNA
molecules.
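
To make the notion of complementarity concrete, the sketch below derives the antisense RNA for a short messenger RNA fragment by standard Watson-Crick pairing (A with U, G with C), written 5' to 3'. The function name and the six-base example are hypothetical and are not taken from SEQ ID NO:1-9.

```python
# Watson-Crick pairing for RNA: A pairs with U, G pairs with C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna):
    """Return the antisense (reverse-complement) RNA of an mRNA, written 5' to 3'."""
    return mrna.upper().translate(RNA_COMPLEMENT)[::-1]


# Hypothetical mRNA fragment, 5'-AUGGCU-3'.
print(antisense_rna("AUGGCU"))  # AGCCAU
```

Applying the function twice returns the original fragment, reflecting that complementarity is symmetric.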

The terms "DNA sequence encoding", "DNA encoding" and "nucleic acid
encoding" refer to the order or sequence of deoxyribonucleotides along a strand of
deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order
of amino acids along the translated polypeptide chain. The DNA sequence thus codes
for the amino acid sequence.
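
How the order of deoxyribonucleotides determines the order of amino acids can be sketched with the standard genetic code. The example below assumes the standard (universal) codon table and reads the coding strand in frame from its first base; the names CODON_TABLE and translate, and the nine-base example, are illustrative and are not part of the specification.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna):
    """Translate a coding-strand DNA sequence, in frame, up to the first stop codon."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3].upper()]
        if amino_acid == "*":  # stop codon ends the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical open reading frame: ATG (Met), GCT (Ala), TAA (stop).
print(translate("ATGGCTTAA"))  # MA
```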

The terms "replicable expression vector" and "expression vector" refer to a 
piece of DNA, usually double-stranded, which may have inserted into it a piece of 
foreign DNA. Foreign DNA is defined as heterologous DNA, which is DNA not
naturally found in the host. The vector is used to transport the foreign or 
heterologous DNA into a suitable host cell. Once in the host cell, the vector can 
replicate independently of or coincidentally with the host chromosomal DNA, and 
several copies of the vector and its inserted (foreign) DNA may be generated. In 
addition, the vector contains the necessary elements that permit translating the foreign
DNA into a polypeptide. Many molecules of the polypeptide encoded by the foreign 
DNA can thus be rapidly synthesized. 

The terms "transformed host cell," "transformed" and "transformation" refer to 
the introduction of DNA into a cell. The cell is termed a "host cell", and it may be a 
prokaryotic or a eukaryotic cell. Typical prokaryotic host cells include various strains
of E. coli. Typical eukaryotic host cells are plant cells, such as maize cells, yeast 
cells, insect cells or animal cells. The introduced DNA is usually in the form of a 
vector containing an inserted piece of DNA. The introduced DNA sequence may be 
from the same species as the host cell or from a different species from the host cell, or 
it may be a hybrid DNA sequence, containing some foreign DNA and some DNA
derived from the host species. 

The abbreviation "SSC" refers to a buffer used in nucleic acid hybridization 
solutions. One liter of the 20X (twenty times concentrate) stock SSC buffer solution 
(pH 7.0) contains 175.3 g sodium chloride and 88.2 g sodium citrate. 
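
The stock recipe above can be turned into molar concentrations and dilution volumes with a short calculation. The sketch below is illustrative only. It assumes that "sodium citrate" means trisodium citrate dihydrate (294.10 g/mol), which the text does not state, and the function names are hypothetical.

```python
NACL_G_PER_MOL = 58.44
# Assumption: the citrate salt is trisodium citrate dihydrate.
CITRATE_G_PER_MOL = 294.10

STOCK_FOLD = 20  # the stock solution is 20X
STOCK_GRAMS_PER_LITER = {"NaCl": 175.3, "sodium citrate": 88.2}


def molarity_at(fold: float) -> dict:
    """Molar concentration of each salt at a given SSC strength (2 means 2X)."""
    scale = fold / STOCK_FOLD
    return {
        "NaCl": STOCK_GRAMS_PER_LITER["NaCl"] / NACL_G_PER_MOL * scale,
        "sodium citrate": STOCK_GRAMS_PER_LITER["sodium citrate"]
        / CITRATE_G_PER_MOL * scale,
    }


def stock_volume_ml(target_fold: float, final_volume_ml: float) -> float:
    """Volume of 20X stock needed for a dilution, from C1 * V1 = C2 * V2."""
    return target_fold * final_volume_ml / STOCK_FOLD


print(molarity_at(20))          # concentrations in the 20X stock
print(stock_volume_ml(2, 500))  # ml of stock for 500 ml of 2X SSC
```

Under the stated assumption, the 20X stock works out to roughly 3 M sodium chloride and 0.3 M sodium citrate.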

In accordance with the present invention, secoisolariciresinol dehydrogenase
protein from Forsythia intermedia has been purified to apparent homogeneity via a 
>6,000 fold purification using a combination of ammonium sulfate precipitation, 
DEAE-cellulose, ADP-sepharose, and Mono P (HR 5/20) chromatography and 
columns. The N-terminus of the purified secoisolariciresinol dehydrogenase protein 
was sequenced to obtain the N-terminal sequence (SEQ ID NO: 11). Tryptic
fragments of the purified secoisolariciresinol dehydrogenase protein were isolated and 
sequenced (SEQ ID NO. 12 (peptide 1) and SEQ ID NO: 13 (peptide 2)). 

The N-terminal (SEQ ID NO: 11) and internal peptide amino acid sequences 
(SEQ ID NO: 12 and SEQ ID NO: 13) were used to construct degenerate 
oligonucleotide primers. Primer DEHYF26 (SEQ ID NO: 14) was constructed based
on the amino acid sequence of peptide 1 having the amino acid sequence set forth in 
SEQ ID NO: 12. Primers DEHYF30RevA (SEQ ID NO: 15) and DEHYF30RevB
(SEQ ID NO: 16) were each constructed based on the amino acid sequence of peptide
2 having the amino acid sequence set forth in SEQ ID NO: 13. Purified F. intermedia
cDNA library DNA (2 ng) was used as the template in PCR amplification reactions
with primer DEHYF26 (SEQ ID NO: 14) and either primer DEHYF30RevA (SEQ ID
NO: 15) or primer DEHYF30RevB (SEQ ID NO: 16). A 200 bp fragment of the
resulting PCR product was used as a probe to screen the F. intermedia cDNA 
library. One positive signal was obtained from this screening, but this clone was
estimated to be truncated at the N-terminal end by approximately 60 amino acid
residues, as was indicated by comparison to the original N-terminal sequence analysis 
of (-)-secoisolariciresinol dehydrogenase. 
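
Degenerate primers such as those described above must cover every codon combination that could encode the chosen peptide stretch. As a hedged illustration, the Python sketch below counts the codon combinations for a peptide and finds the least degenerate window of a given length. The peptide used is hypothetical and is not one of the sequences in the sequence listing, and the function names are illustrative.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


def least_degenerate_window(peptide: str, length: int) -> tuple:
    """Return (start, window, degeneracy) for the best window of `length` residues."""
    peptide = peptide.upper()
    if length > len(peptide):
        raise ValueError("window is longer than the peptide")
    candidates = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    # Lowest degeneracy wins; ties go to the earliest window.
    fold, start, window = min(candidates)
    return start, window, fold


# Hypothetical peptide, for illustration only.
print(least_degenerate_window("LSRMWKDEL", 5))
```

Windows rich in methionine and tryptophan, each encoded by a single codon, give the smallest primer pools, whereas leucine, serine and arginine each multiply the pool sixfold.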

A primer, DEHY19REV (SEQ ID NO. 17), was made from the 3' end of the 
truncated clone and used with the original Forsythia cDNA library purified phage
DNA as template, but failed to yield cDNA clones having the complete N-terminus.
Consequently, another primer, DEHYF30REVB (SEQ ID NO: 18), was synthesized
from the 3' end of the truncated clone and used with the T3 primer (SEQ ID NO: 19) 
in a PCR with the original Forsythia cDNA library purified phage DNA as template. 
This PCR product, when cloned into a TA vector, resulted in a clone having the
complete N-terminus which was obtained from the initial amino acid sequencing of
the blotted protein (SEQ ID NO: 11). A new primer, DEHYNTERM1 (SEQ ID
NO:20), made from the N-terminal DNA sequence of this clone was used with the T7
primer (SEQ ID NO:21) and again with the original purified Forsythia cDNA library
as template. The resulting PCR band of 1 kb was purified on an agarose gel, eluted
by using a Microcon 30 (AMICON) and cloned directly into a TA vector
(Invitrogen). This provided a clone (DEHY130) (SEQ ID NO:22) which had the 
DNA sequence containing the complete N-terminal amino acid sequence present in 
the original protein (SEQ ID NO: 11). The amino acid sequence (SEQ ID NO:23) 
encoded by DEHY130 (SEQ ID NO: 22) was lacking a start methionine. A new
5' primer, designated DEHY130NTERM (SEQ ID NO:24), was synthesized to
include a start methionine at the beginning of the sequence. Also, the 5' primer (SEQ 
ID NO:24) and a 3' primer, designated DEHY130CTERM (SEQ ID NO:25), were 
designed to incorporate Nde I restriction enzyme sites at both ends of the clone for 
future insertion into the SBET expression vector for production of the protein in
E. coli. The resulting PCR product of approximately 859 bp (SEQ ID NO:1),
designated DEHY133, was cloned directly into a TA vector (Invitrogen). The DNA
sequence indicated that the DEHY133 dehydrogenase clone (SEQ ID NO:1) now
contained a Met start codon.

In addition, the Nde I fragment from the engineered DEHY133 clone (SEQ ID
NO:1) was used as a probe to re-screen 300,000 pfu from the original F. intermedia
cDNA library. This resulted in the isolation of additional secoisolariciresinol 
dehydrogenase clones. The nucleic acid sequences of four of these clones are set 
forth in: SEQ ID NO:3 (designated SMDEHY321), SEQ ID NO:5 (designated
SMDEHY431), SEQ ID NO:7 (designated SMDEHY511) and SEQ ID NO:9
(designated SMDEHY631). Some of these clones, such as SMDEHY321 (SEQ ID
NO:3) and SMDEHY631 (SEQ ID NO:9), produced proteins in E. coli that catalyzed
the stereochemical conversion of (-)-secoisolariciresinol into (-)-matairesinol.

The isolation of cDNAs encoding secoisolariciresinol dehydrogenase permits
the development of an efficient expression system for this functional enzyme, provides
useful tools for examining the developmental regulation of lignan biosynthesis and
permits the isolation of other secoisolariciresinol dehydrogenases. The isolation of
the secoisolariciresinol dehydrogenase cDNAs also permits the transformation of a 
wide range of organisms in order to enhance or modify lignan biosynthesis. 

By way of non-limiting examples, the proteins and nucleic acids of the present
invention can be utilized to: elevate or otherwise alter the levels of health-protecting
lignans, including phytoestrogens such as enterolactone and enterodiol, in plant
species, including but not limited to vegetables, grains and fruits, and in food items
incorporating material derived from such genetically altered plants; genetically alter
plant species to provide an abundant, natural supply of lignans useful for a variety of
purposes, for example as nutraceuticals and dietary supplements; and genetically alter
living organisms to produce an abundant supply of optically pure lignans having
desirable biological properties, for example (-)-trachelogenin, which possesses antiviral
properties, and (-)-podophyllotoxin.

N-terminal transport sequences well known in the art (see, e.g., von Heijne, G.
et al., Eur. J. Biochem. 180:535-545 (1989); Stryer, Biochemistry, W.H. Freeman and
Company, New York, NY, p. 769 (1988)) may be employed to direct
secoisolariciresinol dehydrogenase protein to a variety of cellular or extracellular 
locations. 

Sequence variants of wild-type secoisolariciresinol dehydrogenase clones that 
can be produced by deletions, substitutions, mutations and/or insertions are intended
to be within the scope of the invention except insofar as limited by the prior art.
Secoisolariciresinol dehydrogenase amino acid sequence variants may be constructed 
by mutating the DNA sequence that encodes wild-type secoisolariciresinol 
dehydrogenase, such as by using techniques commonly referred to as site-directed 
mutagenesis. Various polymerase chain reaction (PCR) methods now well known in
the field, such as a two primer system like the Transformer Site-Directed Mutagenesis 
kit from Clontech, may be employed for this purpose. 

Following denaturation of the target plasmid in this system, two primers are 
simultaneously annealed to the plasmid; one of these primers contains the desired
site-directed mutation, the other contains a mutation at another point in the plasmid
resulting in elimination of a restriction site. Second strand synthesis is then carried 
out, tightly linking these two mutations, and the resulting plasmids are transformed 
into a mutS strain of E. coli. Plasmid DNA is isolated from the transformed bacteria, 
restricted with the relevant restriction enzyme (thereby linearizing the unmutated
plasmids), and then retransformed into E. coli. This system allows for generation of
mutations directly in an expression plasmid, without the necessity of subcloning or
generation of single-stranded phagemids. The tight linkage of the two mutations and
the subsequent linearization of unmutated plasmids result in high mutation efficiency
and allow minimal screening. Following synthesis of the initial restriction site primer,
this method requires the use of only one new primer type per mutation site. Rather
than prepare each positional mutant separately, a set of "designed degenerate" 
oligonucleotide primers can be synthesized in order to introduce all of the desired 
mutations at a given site simultaneously. Transformants can be screened by 
sequencing the plasmid DNA through the mutagenized region to identify and sort
mutant clones. Each mutant DNA can then be restricted and analyzed by
electrophoresis on Mutation Detection Enhancement gel (J.T. Baker) to confirm that 
no other alterations in the sequence have occurred (by band shift comparison to the 
unmutagenized control). 

The verified mutant duplexes can be cloned into a replicable expression
vector, if not already cloned into a vector of this type, and the resulting expression
construct used to transform E. coli, such as strain E. coli BL21(DE3)pLysS, for high
level production of the mutant protein, and subsequent purification thereof. The 
method of FAB-MS mapping can be employed to rapidly check the fidelity of mutant 
expression. This technique provides for sequencing segments throughout the whole
protein and provides the necessary confidence in the sequence assignment. In a
mapping experiment of this type, protein is digested with a protease (the choice will
depend on the specific region to be modified since this segment is of prime interest 
and the remaining map should be identical to the map of unmutagenized protein). The 
set of cleavage fragments is fractionated by microbore HPLC (reversed phase or ion
exchange, again depending on the specific region to be modified) to provide several
peptides in each fraction, and the molecular weights of the peptides are determined by
FAB-MS. The masses are then compared to the molecular weights of peptides
expected from the digestion of the predicted sequence, and the correctness of the
sequence quickly ascertained. Since this mutagenesis approach to protein
modification is directed, sequencing of the altered peptide should not be necessary if
the MS agrees with prediction. If necessary to verify a changed residue, CAD-tandem 
MS/MS can be employed to sequence the peptides of the mixture in question, or the 
target peptide purified for subtractive Edman degradation or carboxypeptidase Y 
digestion depending on the location of the modification. 
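
The comparison of measured peptide masses with those expected from the predicted sequence can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the inventors: it assumes trypsin as the protease (cleavage after Lys or Arg unless the next residue is Pro), computes neutral monoisotopic masses rather than the protonated ions observed by mass spectrometry, and uses hypothetical sequences and function names.

```python
import re

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(protein: str) -> list:
    """Split after K or R, except when the next residue is P (assumed rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein.upper()) if p]


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def differing_peptides(native: str, mutant: str) -> tuple:
    """Peptides (with masses) found in only one of the two predicted digests."""
    native_map = {p: round(peptide_mass(p), 4) for p in tryptic_peptides(native)}
    mutant_map = {p: round(peptide_mass(p), 4) for p in tryptic_peptides(mutant)}
    only_native = {p: m for p, m in native_map.items() if p not in mutant_map}
    only_mutant = {p: m for p, m in mutant_map.items() if p not in native_map}
    return only_native, only_mutant


# Hypothetical native sequence and a Cys-to-Ser mutant, for illustration only.
print(differing_peptides("MACKGHEKLR", "MASKGHEKLR"))
```

Only the peptide carrying the substitution should shift in mass; every other entry of the map should match the map of the unmutagenized protein, as the passage above describes.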

In the design of a particular site directed mutant, it is generally desirable to
first make a non-conservative substitution (e.g., Ala for Cys, His or Glu) and 
determine if activity is greatly impaired as a consequence. The properties of the 
mutagenized protein are then examined with particular attention to the kinetic
parameters of Km and kcat as sensitive indicators of altered function, from which
changes in binding and/or catalysis per se may be deduced by comparison to the
native enzyme. If the residue is by this means demonstrated to be important by 
activity impairment, or knockout, then conservative substitutions can be made, such 
as Asp for Glu to alter side chain length, Ser for Cys, or Arg for His. For 
hydrophobic segments, it is largely size that will be altered, although aromatics can
also be substituted for alkyl side chains. Changes in the normal product distribution
can indicate which step(s) of the reaction sequence have been altered by the mutation. 
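
As a hedged illustration of how Km and kcat can be compared between native and mutagenized enzyme, the sketch below assumes simple Michaelis-Menten kinetics, v = kcat [E] [S] / (Km + [S]), which the text does not explicitly state. All parameter values and function names are hypothetical.

```python
def initial_rate(substrate: float, km: float, kcat: float, enzyme: float) -> float:
    """Michaelis-Menten initial rate, v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


def compare_to_native(native: dict, mutant: dict) -> dict:
    """Ratios of mutant to native kinetic parameters."""
    return {
        # A higher Km ratio suggests weaker apparent substrate binding.
        "km_ratio": mutant["km"] / native["km"],
        # A lower kcat ratio suggests impaired catalysis.
        "kcat_ratio": mutant["kcat"] / native["kcat"],
        # kcat/Km combines both effects into one efficiency measure.
        "efficiency_ratio": (mutant["kcat"] / mutant["km"])
        / (native["kcat"] / native["km"]),
    }


# Hypothetical parameters (Km in micromolar, kcat per second).
native_enzyme = {"km": 10.0, "kcat": 2.0}
mutant_enzyme = {"km": 40.0, "kcat": 1.0}
print(compare_to_native(native_enzyme, mutant_enzyme))
```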

Other site directed mutagenesis techniques may also be employed with the 
nucleotide sequences of the invention. For example, restriction endonuclease 
digestion of DNA followed by ligation may be used to generate secoisolariciresinol
dehydrogenase deletion variants, as described in Section 15.3 of Sambrook et al.
(Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory 
Press, New York, NY (1989)). A similar strategy may be used to construct insertion 
variants, as described in Section 15.3 of Sambrook et al., supra. 

Oligonucleotide-directed mutagenesis may also be employed for preparing
substitution variants of this invention. It may also be used to conveniently prepare the
deletion and insertion variants of this invention. This technique is well known in the
art as described by Adelman et al. (DNA 2:183 (1983)). Generally, oligonucleotides
of at least 25 nucleotides in length are used to insert, delete or substitute two or more
nucleotides in the secoisolariciresinol dehydrogenase gene. An optimal
oligonucleotide will have 12 to 15 perfectly matched nucleotides on either side of the
nucleotides coding for the mutation. To mutagenize the wild-type secoisolariciresinol 
dehydrogenase, the oligonucleotide is annealed to the single-stranded DNA template 
molecule under suitable hybridization conditions. A DNA polymerizing enzyme, 
usually the Klenow fragment of E. coli DNA polymerase I, is then added. This
enzyme uses the oligonucleotide as a primer to complete the synthesis of the
mutation-bearing strand of DNA. Thus, a heteroduplex molecule is formed such that 
one strand of DNA encodes the wild-type secoisolariciresinol dehydrogenase inserted 
in the vector, and the second strand of DNA encodes the mutated form of 
secoisolariciresinol dehydrogenase inserted into the same vector. This heteroduplex
molecule is then transformed into a suitable host cell.
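
The oligonucleotide design rules given above (at least 25 nucleotides overall, with 12 to 15 perfectly matched nucleotides on either side of the mutation) can be sketched in Python. The template sequence, coordinates and function names are hypothetical. The function returns the mutant sequence in the same sense as the strand supplied, so the reverse complement is the molecule that would anneal to that strand.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def design_mutagenic_oligo(template: str, start: int, end: int,
                           replacement: str, flank: int = 15) -> str:
    """Replace template[start:end] with `replacement`, keeping matched flanks."""
    template = template.upper()
    if not 12 <= flank <= 15:
        raise ValueError("flank should be 12 to 15 matched nucleotides")
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence for the flanks")
    oligo = (template[start - flank:start]
             + replacement.upper()
             + template[end:end + flank])
    if len(oligo) < 25:
        raise ValueError("oligonucleotide should be at least 25 nucleotides")
    return oligo


# Hypothetical 42-nucleotide coding sequence; codon 9 (TGT, Cys) becomes GCT (Ala).
EXAMPLE_TEMPLATE = "ATGGCTTCAAGGCTGAAAGACGAATGTCCGATTCTGGGTAAA"
oligo = design_mutagenic_oligo(EXAMPLE_TEMPLATE, 24, 27, "GCT", flank=12)
print(oligo)                      # mutant sequence, same sense as the template
print(reverse_complement(oligo))  # strand that would anneal to the template
```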

Mutants with more than one amino acid substituted may be generated in one 
of several ways. If the amino acids are located close together in the polypeptide 
chain, they may be mutated simultaneously using one oligonucleotide that codes for 
all of the desired amino acid substitutions. If, however, the amino acids are located
some distance from each other (separated by more than ten amino acids, for example),
it is more difficult to generate a single oligonucleotide that encodes all of the desired 
changes. Instead, one of two alternative methods may be employed. In the first 
method, a separate oligonucleotide is generated for each amino acid to be substituted. 
The oligonucleotides are then annealed to the single-stranded template DNA
simultaneously, and the second strand of DNA that is synthesized from the template
will encode all of the desired amino acid substitutions. 

An alternative method involves two or more rounds of mutagenesis to 
produce the desired mutant. The first round is as described for the single mutants: 
wild-type secoisolariciresinol dehydrogenase DNA is used for the template, an
oligonucleotide encoding the first desired amino acid substitution(s) is annealed to this
template, and the heteroduplex DNA molecule is then generated. The second round 
of mutagenesis utilizes the mutated DNA produced in the first round of mutagenesis 
as the template. Thus, this template already contains one or more mutations. The 
oligonucleotide encoding the additional desired amino acid substitution(s) is then
annealed to this template, and the resulting strand of DNA now encodes mutations
from both the first and second rounds of mutagenesis. This resultant DNA can be
used as a template in a third round of mutagenesis, and so on. 

Eukaryotic expression systems may be utilized for secoisolariciresinol 
dehydrogenase production since they are capable of carrying out any required 
post-translational modifications and of directing the enzyme to the proper membrane
location. A representative eukaryotic expression system for this purpose uses the 
recombinant baculovirus, Autographa californica nuclear polyhedrosis virus 
(AcNPV; M.D. Summers and G.E. Smith, A Manual of Methods for Baculovirus
Vectors and Insect Cell Culture Procedures (1986); Luckow et al., Bio-technology
6:47-55 (1987)) for expression of the secoisolariciresinol dehydrogenases of the
invention. Infection of insect cells (such as cells of the species Spodoptera 
frugiperda) with the recombinant baculoviruses allows for the production of large 
amounts of the secoisolariciresinol dehydrogenase protein. In addition, the 
baculovirus system has other important advantages for the production of recombinant 
secoisolariciresinol dehydrogenase. For example, baculoviruses do not infect humans
and can therefore be safely handled in large quantities. In the baculovirus system, a 
DNA construct is prepared including a DNA segment encoding secoisolariciresinol 
dehydrogenase and a vector. The vector may comprise the polyhedron gene promoter 
region of a baculovirus, the baculovirus flanking sequences necessary for proper 
cross-over during recombination (the flanking sequences comprise about 200-300
base pairs adjacent to the promoter sequence) and a bacterial origin of replication 
which permits the construct to replicate in bacteria. The vector is constructed so that 
(i) the DNA segment is placed adjacent (or operably-linked or "downstream" or
"under the control of") to the polyhedron gene promoter and (ii) the
promoter/secoisolariciresinol dehydrogenase combination is flanked on both sides by
200-300 base pairs of baculovirus DNA (the flanking sequences). 

To produce a secoisolariciresinol dehydrogenase DNA construct, a cDNA 
clone encoding a full length secoisolariciresinol dehydrogenase is obtained using 
methods such as those described herein. The DNA construct is contacted in a host 
cell with baculovirus DNA of an appropriate baculovirus (that is, of the same species
of baculovirus as the promoter encoded in the construct) under conditions such that 
recombination is effected. The resulting recombinant baculoviruses encode the full 
secoisolariciresinol dehydrogenase. For example, an insect host cell can be 
cotransfected or transfected separately with the DNA construct and a functional 
baculovirus. Resulting recombinant baculoviruses can then be isolated and used to
infect cells to effect production of secoisolariciresinol dehydrogenase. Host insect
cells include, for example, Spodoptera frugiperda cells. Insect host cells infected with 
a recombinant baculovirus of the present invention are then cultured under conditions 
allowing expression of the baculovirus-encoded secoisolariciresinol dehydrogenase. 
Recombinant protein thus produced is then extracted from the cells using methods
known in the art. 

Other eukaryotic microbes such as yeasts may also be used to practice this 
invention. The baker's yeast, Saccharomyces cerevisiae, is a commonly used yeast,
although several other strains are available. The plasmid YRp7 (Stinchcomb et al.,
Nature 282:39 (1979); Kingsman et al., Gene 7:141 (1979); Tschemper et al., Gene
10:157 (1980)) is commonly used as an expression vector in Saccharomyces. This
plasmid contains the trp1 gene that provides a selection marker for a mutant strain of
yeast lacking the ability to grow in the absence of tryptophan, such as strains ATCC
No. 44,076 and PEP4-1 (Jones, Genetics 85:12 (1977)). The presence of the trp1 lesion as a
characteristic of the yeast host cell genome then provides an effective environment for
detecting transformation by growth in the absence of tryptophan. Yeast host cells are 
generally transformed using the polyethylene glycol method, as described by Hinnen 
(Proc. Natl. Acad. Sci. USA 75:1929 (1978)). Additional yeast transformation 
protocols are set forth in Gietz et al., N.A.R. 20(17):1425 (1992); Reeves et al.,
FEMS 99:193-197 (1992).

Suitable promoting sequences in yeast vectors include the promoters for 
3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem. 255:2073 (1980)) or other
glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7:149 (1968); Holland et al.,
Biochemistry 17:4900 (1978)), such as enolase, glyceraldehyde-3-phosphate
dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate
isomerase, phosphoglucose isomerase, and glucokinase. In the construction of 
suitable expression plasmids, the termination sequences associated with these genes 
are also ligated into the expression vector 3' of the sequence desired to be expressed 
to provide polyadenylation of the mRNA and termination. Other promoters that have
the additional advantage of transcription controlled by growth conditions are the 
promoter region for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, 
degradative enzymes associated with nitrogen metabolism, and the aforementioned 
glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and
galactose utilization. Any plasmid vector containing yeast-compatible promoter,
origin of replication and termination sequences is suitable. 

Cell cultures derived from multicellular organisms, such as plants, may be used 
as hosts to practice this invention. Transgenic plants can be obtained, for example, by 
transferring plasmids that encode secoisolariciresinol dehydrogenase, and a selectable
marker gene, e.g., the kan gene encoding resistance to kanamycin, into
Agrobacterium tumefaciens containing a helper Ti plasmid as described in Hoeckema
et al., Nature 303:179-181 (1983) and culturing the Agrobacterium cells with leaf
slices of the plant to be transformed as described by An et al., Plant Physiology
81:301-305 (1986). Transformation of cultured plant host cells is normally
accomplished through Agrobacterium tumefaciens, as described above. Cultures of
mammalian host cells and other host cells that do not have rigid cell membrane 
barriers are usually transformed using the calcium phosphate method as originally 
described by Graham and Van der Eb (Virology 52:546 (1978)) and modified as
described in Sections 16.32-16.37 of Sambrook et al., supra. However, other
methods for introducing DNA into cells such as Polybrene (Kawai and Nishizawa,
Mol. Cell. Biol. 4:1172 (1984)), protoplast fusion (Schaffner, Proc. Natl. Acad. Sci.
USA 77:2163 (1980)), electroporation (Neumann et al., EMBO J. 1:841 (1982)), and
direct microinjection into nuclei (Capecchi, Cell 22:479 (1980)) may also be used.
Additionally, animal transformation strategies are reviewed in Monastersky G.M. and
Robl, J.M., Strategies in Transgenic Animal Science, ASM Press, Washington, D.C. 
(1995). Transformed plant calli may be selected through the selectable marker by 
growing the cells on a medium containing, e.g., kanamycin, and appropriate amounts 
of phytohormone such as naphthalene acetic acid and benzyladenine for callus and
shoot induction. The plant cells may then be regenerated and the resulting plants
transferred to soil using techniques well known to those skilled in the art. 

In addition, a gene regulating secoisolariciresinol dehydrogenase production 
can be incorporated into the plant along with a necessary promoter which is inducible. 
In the practice of this embodiment of the invention, a promoter that only responds to
a specific external or internal stimulus is fused to the target cDNA. Thus, the gene
will not be transcribed except in response to the specific stimulus. As long as the 
gene is not being transcribed, its gene product is not produced. 

An illustrative example of a responsive promoter system that can be used in 
the practice of this invention is the glutathione-S-transferase (GST) system in maize.
GSTs are a family of enzymes that can detoxify a number of hydrophobic electrophilic
compounds that often are used as pre-emergent herbicides (Weigand et al., Plant
Molecular Biology 7:235-243 (1986)). Studies have shown that the GSTs are 
directly involved in causing this enhanced herbicide tolerance. This action is primarily 
mediated through a specific 1.1 kb mRNA transcription product. In short, maize has 
a naturally occurring quiescent gene already present that can respond to external
stimuli and that can be induced to produce a gene product. This gene has previously 
been identified and cloned. Thus, in one embodiment of this invention, the promoter 
is removed from the GST responsive gene and attached to a secoisolariciresinol 
dehydrogenase gene that previously has had its native promoter removed. This
engineered gene is the combination of a promoter that responds to an external
chemical stimulus and a gene responsible for successful production of 
secoisolariciresinol dehydrogenase protein. 

In addition to the methods described above, several methods are known in the 
art for transferring cloned DNA into a wide variety of plant species, including
gymnosperms, angiosperms, monocots and dicots (see, e.g., Glick and Thompson,
eds., Methods in Plant Molecular Biology, CRC Press, Boca Raton, Florida (1993)). 
Representative examples include electroporation-facilitated DNA uptake by 
protoplasts (Rhodes et al., Science 240(4849):204-207 (1988)); treatment of 
protoplasts with polyethylene glycol (Lyznik et al., Plant Molecular Biology
13:151-161 (1989)); and bombardment of cells with DNA laden microprojectiles
(Klein et al., Plant Physiol. 91:440-444 (1989) and Boynton et al., Science
240(4858):1534-1538 (1988)). Numerous methods now exist, for example, for the
transformation of cereal crops (see, e.g., McKinnon, G.E. and Henry, R.J., J. Cereal
Science, 22(3):203-210 (1995); Mendel, R.R. and Teeri, T.H., Plant and Microbial
Biotechnology Research Series, 3:81-98, Cambridge University Press (1995);
McElroy, D. and Brettell, R.I.S., Trends in Biotechnology, 12(2):62-68 (1994);
Christou et al., Trends in Biotechnology, 10(7):239-246 (1992); Christou, P. and
Ford, T.L., Annals of Botany, 75(5):449-454 (1995); Park et al., Plant Molecular
Biology, 32(6):1135-1148 (1996); Altpeter et al., Plant Cell Reports, 16:12-17
(1996)). Additionally, plant transformation strategies and techniques are reviewed in
Birch, R.G., Ann Rev Plant Phys Plant Mol Biol 48:297 (1997); Forester et al., Exp. 
Agric. 33:15-33 (1997). Minor variations make these technologies applicable to a 
broad range of plant species. Each of the foregoing publications disclosing methods
for genetically transforming plants is incorporated herein by reference.

By way of non-limiting example, in the practice of the present invention the
following plant genera and species can be genetically transformed with a nucleic acid
molecule encoding a secoisolariciresinol dehydrogenase protein, and/or a nucleic acid
molecule that is complementary to at least a portion of a nucleic acid molecule
encoding a secoisolariciresinol dehydrogenase protein: Arachis (including peanut);
Arecacum (including oil palms); Brassica (including arugula, bok choi, broccoli,
Brussels sprouts, cabbage, cauliflower, kale, mustard, radishes, rape, turnip, radicchio);
Carthamus (including safflower); Cocos (including coconut); Gossypium (including
cotton); Glycine (including soybeans); Helianthus (including sunflower, Jerusalem
artichoke); Linum (including flax); Sesamum (including sesame); Agaricus (including
table mushrooms); Armoracia (including horseradish); Allium (including chives, garlic,
leek, onion); Apium (including celery); Asparagus (including asparagus); Beta
(including beets, sugar beets); Camellia (including tea); Capsicum (including bell,
chile and other peppers); Chenopodiacum (including swiss chard, spinach); Cicer
(including chick peas, garbanzos); Cichorium (including endive); Coffea (including
coffee); Convolvutacum (including sweet potato); Coriandrum (including coriander,
cilantro); Cynara (including artichoke); Daucus (including carrots); Dioscorea
(including yams); Hibiscus (including okra); Lactuca (including bibb, boston, iceberg,
leaf and other lettuces); Lens (including lentils); Pastinaca (including parsnip);
Phaseolus (including field, kidney, navy, pinto, wax beans); Pisum (including peas,
snow peas, sugar snap peas); Rheum (including rhubarb); Solanum (including
eggplant and potatoes); Vigna (including adzuki bean, blackeyed peas, mung beans);
Carya (including pecan); Corylus (including hazelnut); Cucumis (including cucumber,
melon); Cucurbita (including pumpkin, squash, zucchini); Juglans (including walnut);
Olea (including olives); Prunus (including almonds); Pistacia (including pistachio);
Zea; Sorghum; Hordeum; Eleusine; Panicum; Paspalum; Pennisetum; Setaria; Avena;
Oryza; Secale; Triticum; Actinidia (including kiwi); Carica (including papaya); Citrus
(including grapefruit, lemon, orange, tangerine); Fragaria (including strawberries);
Lycopersicon (including tomato); Malus (including apples); Mangifera (including
mango); Musa (including bananas); Prunus (including apricots, cherries, nectarines,
peaches, plums); Pyrus (including pears, Asian pears); Ribes (including currants,
gooseberries); Rubus (including blackberry, raspberry); Vaccinium (including
blueberries, cranberries, lingonberries); Vitis (including grapes).

Each of the foregoing plant transformation techniques has advantages and
disadvantages. In each of the techniques, DNA from a plasmid is genetically
engineered such that it contains not only the gene of interest, but also selectable and
screenable marker genes. A selectable marker gene is used to select only those cells 
that have integrated copies of the plasmid (the construction is such that the gene of 
interest and the selectable and screenable genes are transferred as a unit). The 
screenable gene provides another check for the successful culturing of only those cells
carrying the genes of interest. A commonly used selectable marker gene is neomycin 
phosphotransferase II (NPT II). This gene conveys resistance to kanamycin, a 
compound that can be added directly to the growth media on which the cells grow. 
Plant cells are normally susceptible to kanamycin and, as a result, die. The presence
of the NPT II gene overcomes the effects of the kanamycin and each cell with this
gene remains viable. Another selectable marker gene which can be employed in the
practice of this invention is the gene which confers resistance to the herbicide
glufosinate (Basta). A screenable gene commonly used is the β-glucuronidase gene
(GUS). The presence of this gene is characterized using a histochemical reaction in
which a sample of putatively transformed cells is treated with a GUS assay solution.
After an appropriate incubation, the cells containing the GUS gene turn blue. 
Preferably, the plasmid will contain both selectable and screenable marker genes. 

The plasmid containing one or more of these genes is introduced into either
plant protoplasts or callus cells by any of the previously mentioned techniques. If the
marker gene is a selectable gene, only those cells that have incorporated the DNA
package survive under selection with the appropriate phytotoxic agent. Once the
appropriate cells are identified and propagated, plants are regenerated. Progeny from
the transformed plants must be tested to ensure that the DNA package has been
successfully integrated into the plant genome.

Mammalian host cells may also be used in the practice of the invention.
Examples of suitable mammalian cell lines include monkey kidney CV1 line
transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line 293S
(Graham et al., J. Gen. Virol. 36:59 (1977)); baby hamster kidney cells (BHK,
ATCC CCL 10); Chinese hamster ovary cells (Urlaub and Chasin, Proc. Natl. Acad.
Sci. USA 77:4216 (1980)); mouse Sertoli cells (TM4, Mather, Biol. Reprod. 23:243
(1980)); monkey kidney cells (CV1-76, ATCC CCL 70); African green monkey
kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA,
ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells
(BRL 3A, ATCC CRL 1442); human lung cells (W138, ATCC CCL 75); human liver
cells (Hep G2, HB 8065); mouse mammary tumor cells (MMT 060562,
ATCC CCL 51); rat hepatoma cells (HTC, M1.54, Baumann et al., J. Cell Biol. 85:1
(1980)); and TRI cells (Mather et al., Annals N.Y. Acad. Sci. 383:44 (1982)).
Expression vectors for these cells ordinarily include (if necessary) DNA sequences for
an origin of replication, a promoter located in front of the gene to be expressed, a
ribosome binding site, an RNA splice site, a polyadenylation site, and a transcription
terminator site.

Promoters used in mammalian expression vectors are often of viral origin.
These viral promoters are commonly derived from polyoma virus, Adenovirus 2, and
most frequently Simian Virus 40 (SV40). The SV40 virus contains two promoters
that are termed the early and late promoters. These promoters are particularly useful
because they are both easily obtained from the virus as one DNA fragment that also
contains the viral origin of replication (Fiers et al., Nature 273:113 (1978)). Smaller
or larger SV40 DNA fragments may also be used, provided they contain the
approximately 250-bp sequence extending from the HindIII site toward the BglI site
located in the viral origin of replication.

Alternatively, promoters that are naturally associated with the foreign gene 
(homologous promoters) may be used provided that they are compatible with the host 
cell line selected for transformation. 

An origin of replication may be obtained from an exogenous source, such as 
SV40 or other virus (e.g., Polyoma, Adeno, VSV, BPV) and inserted into the cloning
vector. Alternatively, the origin of replication may be provided by the host cell 
chromosomal replication mechanism. If the vector containing the foreign gene is 
integrated into the host cell chromosome, the latter is often sufficient. 

The use of a secondary DNA coding sequence can enhance production levels
of secoisolariciresinol dehydrogenase protein in transformed cell lines. The secondary
coding sequence typically comprises the enzyme dihydrofolate reductase (DHFR).
The wild-type form of DHFR is normally inhibited by the chemical methotrexate
(MTX). The level of DHFR expression in a cell will vary depending on the amount of
MTX added to the cultured host cells. An additional feature of DHFR that makes it
particularly useful as a secondary sequence is that it can be used as a selection marker
to identify transformed cells. Two forms of DHFR are available for use as secondary
sequences, wild-type DHFR and MTX-resistant DHFR. The type of DHFR used in a
particular host cell depends on whether the host cell is DHFR deficient (such that it
either produces very low levels of DHFR endogenously, or it does not produce
functional DHFR at all). DHFR-deficient cell lines such as the CHO cell line
described by Urlaub and Chasin, supra, are transformed with wild-type DHFR coding
sequences. After transformation, these DHFR-deficient cell lines express functional
DHFR and are capable of growing in a culture medium lacking the nutrients
hypoxanthine, glycine and thymidine. Nontransformed cells will not survive in this
medium.

The MTX-resistant form of DHFR can be used as a means of selecting for
transformed host cells in those host cells that endogenously produce normal amounts
of functional DHFR that is MTX sensitive. The CHO-K1 cell line (ATCC No. CCL 61)
possesses these characteristics, and is thus a useful cell line for this purpose. The
addition of MTX to the cell culture medium will permit only those cells transformed
with the DNA encoding the MTX-resistant DHFR to grow. The nontransformed cells
will be unable to survive in this medium.

Prokaryotes may also be used as host cells for the initial cloning steps of this
invention. They are particularly useful for rapid production of large amounts of DNA,
for production of single-stranded DNA templates used for site-directed mutagenesis,
for screening many mutants simultaneously, and for DNA sequencing of the mutants
generated. Suitable prokaryotic host cells include E. coli K12 strain 294 (ATCC
No. 31,446), E. coli strain W3110 (ATCC No. 27,325), E. coli χ1776 (ATCC
No. 31,537), and E. coli B; however, many other strains of E. coli, such as HB101,
JM101, NM522, NM538, NM539, and many other species and genera of prokaryotes
including bacilli such as Bacillus subtilis, other enterobacteriaceae such as Salmonella
typhimurium or Serratia marcescens, and various Pseudomonas species may all be
used as hosts. Prokaryotic host cells or other host cells with rigid cell walls are
preferably transformed using the calcium chloride method as described in section 1.82
of Sambrook et al., supra. Alternatively, electroporation may be used for
transformation of these cells. Prokaryote transformation techniques are set forth in
Dower, W. J., in Genetic Engineering, Principles and Methods, 12:275-296, Plenum
Publishing Corp. (1990); Hanahan et al., Meth. Enzymol., 204:63 (1991).

As a representative example, cDNA sequences encoding secoisolariciresinol
dehydrogenase may be transferred to the (His)6•Tag pET vector commercially
available (from Novagen) for overexpression in E. coli as a heterologous host. This
pET expression plasmid has several advantages in high level heterologous expression
systems. The desired cDNA insert is ligated in frame to plasmid vector sequences
encoding six histidines followed by a highly specific protease recognition site
(thrombin) that are joined to the amino terminus codon of the target protein. The
histidine "block" of the expressed fusion protein promotes very tight binding to
immobilized metal ions and permits rapid purification of the recombinant protein by
immobilized metal ion affinity chromatography. The histidine leader sequence is then
cleaved at the specific proteolysis site by treatment of the purified protein with
thrombin, and the secoisolariciresinol dehydrogenase protein eluted. This
overexpression-purification system has high capacity, excellent resolving power and is
fast, and the chance of a contaminating E. coli protein exhibiting similar binding
behavior (before and after thrombin proteolysis) is extremely small.

As will be apparent to those skilled in the art, any plasmid vectors containing
replicon and control sequences that are derived from species compatible with the host
cell may also be used in the practice of the invention. The vector usually has a
replication site, marker genes that provide phenotypic selection in transformed cells,
one or more promoters, and a polylinker region containing several restriction sites for
insertion of foreign DNA. Plasmids typically used for transformation of E. coli
include pBR322, pUC18, pUC19, pUC118, pUC119, and Bluescript M13, all of
which are described in Sections 1.12-1.20 of Sambrook et al., supra. However, many
other suitable vectors are available as well. These vectors contain genes coding for
ampicillin and/or tetracycline resistance which enable cells transformed with these
vectors to grow in the presence of these antibiotics.

The promoters most commonly used in prokaryotic vectors include the
β-lactamase (penicillinase) and lactose promoter systems (Chang et al., Nature
275:615 (1978); Itakura et al., Science 198:1056 (1977); Goeddel et al., Nature
281:544 (1979)), a tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids
Res. 8:4057 (1980); EPO Appl. Publ. No. 36,776), and the alkaline phosphatase
systems. While these are the most commonly used, other microbial promoters have
been utilized, and details concerning their nucleotide sequences have been published,
enabling a skilled worker to ligate them functionally into plasmid vectors (see
Siebenlist et al., Cell 20:269 (1980)).

Many eukaryotic proteins normally secreted from the cell contain an
endogenous secretion signal sequence as part of the amino acid sequence. Thus,
proteins normally found in the cytoplasm can be targeted for secretion by linking a
signal sequence to the protein. This is readily accomplished by ligating DNA
encoding a signal sequence to the 5' end of the DNA encoding the protein and then
expressing this fusion protein in an appropriate host cell. The DNA encoding the
signal sequence may be obtained as a restriction fragment from any gene encoding a
protein with a signal sequence. Thus, prokaryotic, yeast, and eukaryotic signal
sequences may be used herein, depending on the type of host cell utilized to practice
the invention. The DNA and amino acid sequence encoding the signal sequence
portion of several eukaryotic genes including, for example, human growth hormone,
proinsulin, and prealbumin are known (see Stryer, Biochemistry, W.H. Freeman and
Company, New York, NY, p. 769 (1988)), and can be used as signal sequences in
appropriate eukaryotic host cells. Yeast signal sequences, as for example acid
phosphatase (Arima et al., Nucleic Acids Res. 11:1657 (1983)), alpha-factor, alkaline
phosphatase and invertase may be used to direct secretion from yeast host cells.
Prokaryotic signal sequences from genes encoding, for example, LamB or OmpF
(Wong et al., Gene 68:193 (1988)), MalE, PhoA, or beta-lactamase, as well as other
genes, may be used to target proteins from prokaryotic cells into the culture medium.

Trafficking sequences from plants, animals and microbes can be employed in
the practice of the invention to direct the gene product to the cytoplasm, endoplasmic
reticulum, mitochondria or other cellular components, or to target the protein for
export to the medium. These considerations apply to the overexpression of
secoisolariciresinol dehydrogenase, and to direction of expression within cells or
intact organisms to permit gene product function in any desired location.

Suitable vectors containing DNA encoding replication sequences, regulatory
sequences, phenotypic selection genes and the secoisolariciresinol dehydrogenase
DNA of interest are constructed using standard recombinant DNA procedures.
Isolated plasmids and DNA fragments are cleaved,
tailored, and ligated together in a specific order to generate the desired vectors, as is
well known in the art (see, for example, Sambrook et al., supra).

As discussed above, secoisolariciresinol dehydrogenase variants are preferably
produced by means of mutation(s) that are generated using the method of site-specific
mutagenesis. This method requires the synthesis and use of specific oligonucleotides
that encode both the sequence of the desired mutation and a sufficient number of
adjacent nucleotides to allow the oligonucleotide to stably hybridize to the DNA
template.

A secoisolariciresinol dehydrogenase gene, or an antisense nucleic acid
fragment complementary to all or part of a secoisolariciresinol dehydrogenase gene,
may be introduced, as appropriate, into any plant species for a variety of purposes
including, but not limited to: altering or improving the color, texture, durability and
pest-resistance of wood tissue, especially heartwood tissue; reducing the formation, or
otherwise altering the levels, of lignans and/or lignins in plant species, such as corn,
which are useful as animal fodder, thereby enhancing the availability of the cellulose
fraction of the plant material to the digestive system of animals ingesting the plant
material; reducing, or otherwise altering the levels of, the lignan/lignin content of
plant species utilized in pulp and paper production, thereby making pulp and paper
production easier and cheaper; improving the defensive capability of a plant against
predators and pathogens by enhancing the production of defensive lignans or lignins;
the alteration of other ecological interactions mediated by lignans or lignins;
producing elevated levels of optically-pure lignan enantiomers as medicines or food
additives; introducing, enhancing or inhibiting the production of secoisolariciresinol
dehydrogenases, or the production of matairesinol and its derivatives. A
secoisolariciresinol dehydrogenase gene may be introduced into any organism for a
variety of purposes including, but not limited to: introducing, enhancing or inhibiting
the production of secoisolariciresinol dehydrogenase, or the production of
matairesinol and its derivatives. Any art-recognized technique, utilizing a nucleic acid
molecule of the present invention, can be used to enhance, inhibit or otherwise alter
the production of secoisolariciresinol dehydrogenase, or the production of
matairesinol and its derivatives.

The foregoing may be more fully understood in connection with the following
representative examples, in which "Plasmids" are designated by a lower case p
followed by an alphanumeric designation. The starting plasmids used in this invention
are either commercially available, publicly available on an unrestricted basis, or can be
constructed from such available plasmids using published procedures. In addition,
other equivalent plasmids are known in the art and will be apparent to the ordinary
artisan.

"Digestion", "cutting" or "cleaving" of DNA refers to catalytic cleavage of the 
DNA with an enzyme that acts only at particular locations in the DNA. These 
enzymes are called restriction endonucleases, and the site along the DNA sequence 
where each enzyme cleaves is called a restriction site. The restriction enzymes used in
this invention are commercially available and are used according to the instructions
supplied by the manufacturers. (See also Sections 1.60-1.61 and Sections 3.38-3.39 
of Sambrook et al., supra.) 

"Recovery" or "isolation" of a given fragment of DNA from a restriction 
digest means separation of the resulting DNA fragment on a polyacrylamide or an
agarose gel by electrophoresis, identification of the fragment of interest by
comparison of its mobility versus that of marker DNA fragments of known molecular
weight, removal of the gel section containing the desired fragment, and separation of
the gel from DNA. This procedure is known generally. For example, see Lawn et al.
(Nucleic Acids Res. 9:6103-6114 (1982)), and Goeddel et al. (Nucleic Acids Res.,
supra).

The following examples merely illustrate the best mode now contemplated for 
practicing the invention, but should not be construed to limit the invention. 

Example 1 

Isolation of Secoisolariciresinol Dehydrogenase Protein from
Forsythia intermedia

The following materials and methods were utilized in Examples 1 and 2, unless 
otherwise stated.

Plant materials - F. intermedia plants were either obtained from Bailey's 
Nursery (var. Lynwood Gold, St. Paul, MN), and maintained in Washington State 
University greenhouse facilities, or were gifts from the local community.

Materials - All solvents and chemicals used were reagent or HPLC grade.
DEAE cellulose and Adenosine 2',5'-diphosphate-Sepharose were purchased from
Sigma; MonoP HR-5/20 and SDS-PAGE molecular weight standards were obtained
from Pharmacia LKB Biotechnology, Inc. Taq thermostable DNA polymerase and
restriction enzymes (BamH I, Nde I, Spe I) were obtained from Promega. pT7Blue
T-vector and competent NovaBlue cells were purchased from Novagen and
radiolabeled nucleotide ([α-³²P]dCTP) was from DuPont NEN. Oligonucleotide
primers for polymerase chain reaction (PCR) and sequencing were synthesized by
Gibco BRL Life Technologies.

Instrumentation - ¹H and ¹³C nuclear magnetic resonance spectra were
recorded on a Bruker AMX300 using CDCl₃ as solvent with chemical shifts (δ ppm)
reported downfield from tetramethylsilane (internal standard). High performance
liquid chromatography was carried out using either reversed phase (Waters, Nova-Pak
C₁₈, 150 x 3.9 mm inner diameter) or chiral (Daicel, Chiralcel OD, 250 x 4.6 mm
inner diameter) columns with detection at 280 nm. Radioactive samples were
analyzed in ScintiVerse II (Fisher Scientific) and measured using a liquid scintillation
counter (Packard, Tricarb 2000 CA). Mass spectra (EI mode) were obtained using a
Waters Integrity™ System equipped with a Thermabeam™ Mass Detector. Amino
acid sequences were obtained using an Applied Biosystems protein sequencer with
on-line HPLC detection, according to the manufacturer's instructions. UV (including
RNA and DNA determinations at 260 nm) spectra were recorded on a Lambda 6
UV/VIS spectrophotometer. A Temptronic II thermocycler (Thermolyne) was used
for all PCR amplifications. Purification of plasmid DNA for sequencing employed a
Wizard Plus SV Miniprep DNA Purification System (Promega), with DNA sequences
determined using an Applied Biosystems Model 373A automated sequencer.

Synthesis of (±)-[9,9'-³H₂]Secoisolariciresinols - To [9-³H₂]coniferyl alcohol
(0.5 mM in acetone, 65 MBq, 7 ml) was added FeCl₃ (aqueous solution, 700 mg,
24 ml), at room temperature. Following stirring for 10 min, the reaction mixture was
extracted with ether (30 ml x 3). The ether solubles were combined, extracted with
water (20 ml), dried with Na₂SO₄, and evaporated to dryness in vacuo. The residue
was reconstituted in a minimum amount of CH₂Cl₂ and applied to a silica gel column
(15 x 2.5 cm inner diameter) eluted with CH₂Cl₂:ether (4:1) to give pure
(±)-[9,9'-³H₂]pinoresinol (0.1 mM, 13 MBq, 36 mg, 20%). To a stirred solution of
(±)-[9,9'-³H₂]pinoresinols (0.1 mM in MeOH, 13 MBq, 5 ml) was added Pd/C (10%, 80 mg)
under H₂. After 24 h reduction, the catalyst was removed by filtration and washed with
MeOH (5 ml); the MeOH solubles were combined and evaporated to dryness
in vacuo to afford, following preparative silica TLC (developed with
EtOAc:hexanes:methanol 10:10:1), (±)-[9,9'-³H₂]secoisolariciresinols (0.07 mM, 9.1
MBq, 25 mg, 70%).

Synthesis of (±)-[Ar-²H]secoisolariciresinol - [Ar-²H]Secoisolariciresinol
was synthesized as described in Umezawa, T., Davin, L.B. and Lewis, N.G., J. Biol.
Chem., 266:10210-10217 (1991).

Enzyme Assays - (1) Radiochemical Assays with
(±)-[9,9'-³H₂]secoisolariciresinols - Secoisolariciresinol dehydrogenase activity was
assayed by monitoring the formation of (-)-[9'-³H₂]matairesinol. Each assay
consisted of NAD (50 mM in 0.1 M potassium phosphate buffer, pH 7, 5 µl),
(±)-[9,9'-³H₂]secoisolariciresinols (28 nM, 130 MBq/mmol in ethanol, 5 µl) and buffer
(50 mM Tris-HCl, pH 8.8, 470 µl). The enzymatic reaction was initiated by addition
of the enzyme preparation (20 µl). After 1 h incubation at 30°C with shaking, the
mixture was extracted with EtOAc (500 µl x 2) containing unlabelled
(±)-matairesinols (200 µg) as radiochemical carriers. After centrifugation (13,800 x g, 5
min) the EtOAc solubles were removed, evaporated to dryness in vacuo,
reconstituted in MeOH:3% acetic acid in H₂O (1:1, 200 µl), and an aliquot (20 µl)
subjected to reversed-phase column chromatography. The elution conditions were as
follows: linear gradient acetonitrile/3% acetic acid in H₂O from 10:90 to 30:70
between 0 and 35 min; then to 5:95 in 5 min and finally isocratic at 5:95 for 5 min, at
a flow rate of 1 ml min⁻¹, and detection at 280 nm. Fractions corresponding to
matairesinol were individually collected, aliquots removed for liquid scintillation
counting, and the remainder freeze-dried.
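
The elution program stated above can be written as a piecewise-linear function of run time. The following Python sketch is illustrative only: the function name `acetonitrile_percent` and the breakpoint table are hypothetical and are built solely from the segments stated in the text (10:90 to 30:70 over 0 to 35 min, to 5:95 over the next 5 min, then isocratic at 5:95 for 5 min).

```python
# Illustrative sketch: percent acetonitrile in the reversed-phase gradient
# as a function of run time. Breakpoints are taken from the text; the
# function and variable names are hypothetical.

# (time in min, % acetonitrile); the balance is 3% acetic acid in water.
GRADIENT = [(0.0, 10.0), (35.0, 30.0), (40.0, 5.0), (45.0, 5.0)]


def acetonitrile_percent(t_min):
    """Linearly interpolate % acetonitrile at time t_min (0 to 45 min)."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-45 min program")
    for (t0, p0), (t1, p1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    for t in (0, 17.5, 35, 37.5, 40, 45):
        print(f"{t:5.1f} min: {acetonitrile_percent(t):4.1f}% acetonitrile")
```

A table of breakpoints is used here, rather than hard-coded branches, so that a different program could be described by editing the list alone.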

(2) Assays with (±)-[Ar-²H]secoisolariciresinol - [Ar-²H]Secoisolariciresinol
(2 mg) in EtOH (500 µl) was added to Tris-HCl buffer (50 mM, pH 8.8, 10 ml)
containing ca. 2 µg dehydrogenase and 40 µmol NAD. After 1 h incubation at
30°C with shaking, the mixture was extracted with EtOAc (10 ml x 2). The solvent
was evaporated and the extract was purified by HPLC. The matairesinol peak was
collected, freeze-dried, and gave 0.8 mg matairesinol when analyzed by MS.

Chemical Conversion of Enzymatically Formed [9'-³H₂]Matairesinol to
[9'-³H₂]Secoisolariciresinol - [9'-³H₂]Matairesinol (0.5 kBq), collected after
reversed-phase column chromatography, was reduced with LiAlH₄ to give
[9'-³H₂]secoisolariciresinol (0.26 kBq). Chiral HPLC (Daicel OD) analysis revealed
that only (-)-[9-³H]secoisolariciresinol was formed, indicating that only
(-)-[9-³H]matairesinol had been enzymatically generated.

Synthesis of Lactol - To matairesinol (in toluene, 0.5 mM, 2 ml) was added
diisobutylaluminium hydride (in hexanes, 1 M, 0.6 ml) dropwise at -78°C. The
reaction mixture was stirred for one hour at -78°C, quenched with a few drops of
HCl (2 N), then extracted with EtOAc (20 ml). The EtOAc solubles were extracted
with water (6 ml), evaporated to dryness in vacuo and subjected to preparative TLC
(developed with EtOAc:hexanes:methanol 10:10:1) to afford the required lactol (0.35
mM, 70%).

Secoisolariciresinol dehydrogenase protein was isolated from Forsythia
intermedia, and partially sequenced, in the following manner.

General Procedures for the Enzyme Purification - All manipulations were
carried out at 4°C with chromatographic eluents monitored at 280 nm, unless
otherwise indicated. Protein concentrations, using γ-globulin as standard, were
determined by the method of Bradford (Bradford, M. M., Analyt. Biochem., 72:248
(1976)). Polyacrylamide gel electrophoresis was performed with Laemmli's buffer
system under denaturing or non-denaturing conditions and gradient gels (4-15%,
BioRad) (Laemmli, U. K., Nature, 227, 680 (1970)); proteins were then visualized by
silver staining (Morrissey, J. H., Anal. Biochem., 117, 307 (1981)).

Preparation of Cell-free Extracts - F. intermedia stems (2 kg) were frozen
(liquid N₂) and pulverized in a Waring Blender (Model CB6). The resulting powder
was homogenized with Tris-HCl buffer (50 mM, pH 7.5, 2 L) containing 5 mM
dithiothreitol (buffer A). The homogenate was filtered through four layers of
cheesecloth into a beaker containing polyvinylpolypyrrolidone (10% w/v). The
filtrate was centrifuged (10,000 x g, 15 min) and the resulting supernatant
fractionated with (NH₄)₂SO₄. Proteins precipitating between 30-60% saturation were
recovered by centrifugation (10,000 x g, 30 min) with the pellet then reconstituted in
a minimum amount of buffer A.

DEAE Chromatography - The crude enzyme preparation (445 mg in 90 ml
buffer A, 4.1 nmol h⁻¹ mg⁻¹) was applied to a DEAE cellulose column (40 x 2.6 cm
inner diameter) equilibrated in buffer A. Secoisolariciresinol dehydrogenase was
eluted (after washing the column with 25 ml of buffer A) with a linear NaCl gradient
(0-2 M in 500 ml) in buffer A at a flow rate of 2.5 ml min⁻¹. Active fractions were
combined, concentrated by ultrafiltration (Amicon, YM10 membrane) to 50 ml and
dialyzed (25 mM Tris-HCl buffer, pH 7.5) overnight.

Affinity (2',5'-ADP-Sepharose) Chromatography - The active fractions
from the DEAE cellulose chromatography (201 mg, 14.4 nmol h⁻¹ mg⁻¹) were applied
to a 2',5'-ADP-Sepharose (10 x 1 cm inner diameter) column previously equilibrated
in Tris-HCl buffer (25 mM, pH 7.5). The column was first washed with 20 ml of the
same buffer, then with 50 ml buffer A containing 500 mM NaCl at a flow rate of 1 ml
min⁻¹, and finally secoisolariciresinol dehydrogenase was eluted with NAD (10 mM) in
buffer A. The active fractions were combined and dialyzed 16 hours against buffer A.

MonoP (HR 5/20) Column Chromatography - Active protein (185 µg,
8405 nmol h⁻¹ mg⁻¹) from the preceding step was applied to a MonoP column
equilibrated in buffer A, washed with buffer A (8 ml) and eluted with a linear NaCl
gradient (0-2 M in 145 ml) in buffer A at a flow rate of 1 ml min⁻¹. The active
fractions (74 µg, 17.7 µmol h⁻¹ mg⁻¹) were combined, dialyzed against buffer A, then
rechromatographed on the MonoP column using the procedure described above.
Secoisolariciresinol dehydrogenase (31 µg, 24.27 µmol h⁻¹ mg⁻¹) obtained was next
analyzed by SDS-PAGE.
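
The protein amounts and specific activities quoted for each step allow a conventional purification summary (total activity, fold purification, recovery) to be derived. The Python sketch below is illustrative only: the derived figures are simple arithmetic on the quoted values and are not reported in the text, and the names `STEPS` and `purification_table` are hypothetical. It assumes the 185 µg of active protein corresponds to the pooled 2',5'-ADP-Sepharose eluate.

```python
# Illustrative sketch: purification summary computed from the protein amounts
# and specific activities quoted in the text. Names are hypothetical.

# (step, protein in mg, specific activity in nmol h^-1 mg^-1)
STEPS = [
    ("Ammonium sulfate fraction", 445.0, 4.1),
    ("DEAE cellulose", 201.0, 14.4),
    ("2',5'-ADP-Sepharose", 0.185, 8405.0),
    ("MonoP (first pass)", 0.074, 17700.0),   # 17.7 umol h^-1 mg^-1
    ("MonoP (second pass)", 0.031, 24270.0),  # 24.27 umol h^-1 mg^-1
]


def purification_table(steps):
    """Return (step, total activity, fold purification, % recovery) rows."""
    _, protein0, sa0 = steps[0]
    total0 = protein0 * sa0
    rows = []
    for name, protein_mg, sa in steps:
        total = protein_mg * sa  # total activity in nmol h^-1
        rows.append((name, total, sa / sa0, 100.0 * total / total0))
    return rows


if __name__ == "__main__":
    for name, total, fold, recovery in purification_table(STEPS):
        print(f"{name:28s} {total:9.1f} nmol/h  {fold:8.1f}-fold  {recovery:6.1f}%")
```

Fold purification is taken relative to the crude ammonium sulfate fraction, so the first row is 1-fold by construction.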

Amino Acid Sequencing - (-)-Secoisolariciresinol dehydrogenase was first
submitted to SDS-polyacrylamide gel electrophoresis and then electroblotted onto a
PVDF membrane using the procedures described by Hunkapiller et al. (Hunkapiller,
M., Methods Enzymol., 91:227) and Matsudaira (Matsudaira, P., J. Biol. Chem.,
262:10035 (1987)), respectively. Briefly, a minigel was first electrophoresed in
running buffer (25 mM Tris, 192 mM glycine and 0.1% SDS) containing reduced
glutathione (5 µM) for 30 min at 8 mA constant current after which the cathode
buffer was replaced with fresh running buffer containing 0.1 M thioglycolate.

To (-)-secoisolariciresinol dehydrogenase was added loading buffer, with the
mixture next heated at 55°C for 15 min, loaded onto the minigel and then
electrophoresed at 20 mA constant current for 45 min. After electrophoresis, the gel
was soaked in transfer buffer (10 mM CAPS, 10% methanol, pH 11.0) for 10 min,
then placed into a blotting apparatus and electroeluted for 1 hour at 150 mA constant
current in transfer buffer at 4°C. After staining with Coomassie blue R-250, the band
corresponding to (-)-secoisolariciresinol dehydrogenase was cut, rinsed with
deionized H₂O and directly submitted to amino acid sequencing to obtain the
N-terminal sequence (SEQ ID NO:11).

Trypsin digestion - To the pure (-)-secoisolariciresinol dehydrogenase (200
pmol in 60 µl water) was added urea to give a final concentration of 8 M. After
incubation at 37°C for 30 min, 0.2 M ammonium bicarbonate/1 mM CaCl₂ (60 µl) and
trypsin (0.5 µg/µl in 0.01% TFA, 2 µl) were added, and the mixture digested at 37°C
for 12 h after which more trypsin (2 µl) was added, with the digestion allowed to
continue for another 12 h. The enzymatic reaction was stopped by addition of TFA
(4 µl). The resulting mixture, subjected to reversed-phase HPLC analysis (C-8
column, Applied Biosystems), was eluted with a linear gradient from 0 to 100%
acetonitrile (in 0.1% TFA) in 2 hours at a flow rate of 0.2 ml min⁻¹ with detection at
214 nm. Fractions containing individual oligopeptide peaks were collected manually,
concentrated (SpeedVac) and submitted to amino acid sequencing as before. The
amino acid sequences of two secoisolariciresinol dehydrogenase trypsin-liberated
oligopeptides are set forth in SEQ ID NO:12 (peptide 1) and SEQ ID NO:13
(peptide 2).

Example 2 

Cloning of Secoisolariciresinol Dehydrogenase cDNAs from
Forsythia intermedia 

F. intermedia Stem cDNA Library Synthesis - Total RNA (approximately
300 µg g⁻¹ fresh weight) was obtained (Dong, J. Z. and Dunstan, D. I., Plant Cell
Reports 15:516-521 (1996)) from young green stems of greenhouse grown F.
intermedia plants (var. Lynwood Gold). An F. intermedia stem cDNA library was
constructed using 5 µg of purified poly A+ mRNA (Oligotex-dT Suspension,
QIAGEN) with the ZAP-cDNA II Gold packaging extract (Stratagene), with a titer of
1.2 x 10⁶ pfu for the primary library. A 30 ml portion of the amplified library (1.2 x
10¹⁰ pfu/ml; 158 ml total) (Sambrook, J., Fritsch, E. F. and Maniatis, T. (1989)
Molecular Cloning, Edition 2, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor) was used to obtain pure cDNA library DNA for PCR (Ausubel, F. M., Brent,
R., Kingston, R. E., Moore, D. D., Seidman, J. G., Smith, J. A. and Struhl, K. (1991)
Current Protocols in Molecular Biology. 2 vols., Greene Publishing Associates and
Wiley Interscience, John Wiley & Sons, New York, NY).
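
As a quick check on scale, the number of phage in the amplified library and in the 30 ml portion follows directly from the stated titer and volumes. The short Python sketch below is plain arithmetic on those figures; the constant and function names are hypothetical.

```python
# Illustrative arithmetic: phage in the amplified library, using figures from
# the text. Variable and function names are hypothetical.

AMPLIFIED_TITER_PFU_PER_ML = 12_000_000_000  # 1.2 x 10^10 pfu/ml
TOTAL_VOLUME_ML = 158
PORTION_ML = 30


def pfu_in_volume(titer_pfu_per_ml, volume_ml):
    """Number of plaque-forming units contained in a given volume."""
    return titer_pfu_per_ml * volume_ml


if __name__ == "__main__":
    total = pfu_in_volume(AMPLIFIED_TITER_PFU_PER_ML, TOTAL_VOLUME_ML)
    portion = pfu_in_volume(AMPLIFIED_TITER_PFU_PER_ML, PORTION_ML)
    print(f"whole amplified library: {total:.3e} pfu")
    print(f"30 ml portion:           {portion:.3e} pfu")
```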

(-)-Secoisolariciresinol Dehydrogenase DNA Probe Synthesis - The
N-terminal (SEQ ID NO:11) and internal peptide amino acid sequences (SEQ ID
NO:12 and SEQ ID NO:13) were used to construct degenerate oligonucleotide
primers. Primer DEHYF26 (SEQ ID NO:14) was constructed based on the amino
acid sequence of peptide 1 having the amino acid sequence set forth in SEQ ID
NO:12. Primers DEHYF30RevA (SEQ ID NO:15) and DEHYF30RevB (SEQ ID
NO:16) were each constructed based on the amino acid sequence of peptide 2 having
the amino acid sequence set forth in SEQ ID NO:13. Purified F. intermedia cDNA
library DNA (2 ng) was used as the template in 100 µl PCR reactions (10 mM
Tris-HCl [pH 9.0], 50 mM KCl, 0.1% Triton X100, 2.5 mM MgCl₂, 0.2 mM each dNTP
and 2.5 units Taq DNA polymerase) with primer DEHYF26 (SEQ ID NO:14) and
either primer DEHYF30RevA (SEQ ID NO:15) or primer DEHYF30RevB (SEQ ID
NO:16). PCR amplification was carried out in a thermocycler with 35 cycles of 94°C
denaturing for 1 min, 50°C annealing for 2 min, and 72°C extension for 3 min. PCR
products were resolved in 1.5% agarose gels where a single band of approximately
200 bp was obtained. The resulting PCR product was then ligated into pT7Blue
T-vector and transformed into competent NovaBlue cells according to Novagen's
instructions. The recombinant plasmid was used for DNA sequencing. DNA
sequence analysis revealed that the insert coded for one of the initial internal trypsin
digest fragments obtained from the native plant protein. A BamH I / Spe I fragment
of approximately 200 bp was cut from the plasmid preparation and used as a probe to
screen the cDNA library.
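
Because degenerate primers are back-translated from peptide sequence, the number of distinct oligonucleotides in a primer pool is the product of the codon counts for each residue. The Python sketch below illustrates that calculation under the standard genetic code. The example peptide is HYPOTHETICAL and is not SEQ ID NO:12 or SEQ ID NO:13; the function name `primer_degeneracy` is likewise an assumption made for illustration.

```python
# Illustrative sketch: degeneracy of a primer back-translated from a peptide.
# The peptide below is HYPOTHETICAL; it is not SEQ ID NO:12 or SEQ ID NO:13.

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(peptide):
    """Count the distinct DNA sequences that can encode the peptide."""
    degeneracy = 1
    for residue in peptide.upper():
        degeneracy *= CODONS_PER_RESIDUE[residue]
    return degeneracy


if __name__ == "__main__":
    example = "MWKDEF"  # hypothetical six-residue stretch
    print(example, primer_degeneracy(example))
```

Stretches rich in Met, Trp and two-codon residues keep the pool small, which is one common reason for choosing such regions when designing degenerate primers.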
Library screening - Approximately 300,000 pfu of F. intermedia amplified
cDNA library were plated for screening, according to Stratagene's instructions.
Plaques were blotted onto Magna Nylon membrane circles (Micron Separations Inc.),
which were then allowed to air dry. The membranes were placed between two layers
of Whatman 3 MM Chromatography paper. cDNA library phage were fixed to the
membranes and denatured in one step by autoclaving for 2 min at 100°C with fast
exhaust. The membranes were washed for 30 min at 37°C in 2 X SSC and
prehybridized for 12 h with gentle shaking at 45°C in a hybridization solution
consisting of 6 X SSC, 0.5% SDS, and 5 X Denhardt's reagent. The [32P]-radiolabeled
200 bp probe was denatured by boiling for 10 min, quickly cooled on ice for 10 min,
and added to the prehybridized membranes in 30 ml of fresh hybridization solution.
Hybridization was performed at 45°C for 24 h with gentle shaking. Membranes were
then washed in 4 X SSC at room temperature for 10 min, followed by an additional
wash in 4 X SSC at 45°C for 10 min. Membranes were exposed to X-ray film (Jersey
Lab Supply) with intensifying screens at -80°C for 24 h.
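
The 30 ml of hybridization solution can be made up from concentrated stocks with the usual C1 x V1 = C2 x V2 relation. The sketch below is illustrative only. The stock strengths (20 X SSC, 10% SDS, 50 X Denhardt's) are assumptions and are not stated in this example; only the final concentrations and the 30 ml volume come from the text.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Solve C1 * V1 = C2 * V2 for the stock volume V1."""
    return final_conc * final_volume_ml / stock_conc


FINAL_VOLUME_ML = 30.0  # volume of fresh hybridization solution given in the text

# name -> (final concentration from the text, ASSUMED stock concentration)
COMPONENTS = {
    "SSC (X)": (6.0, 20.0),
    "SDS (%)": (0.5, 10.0),
    "Denhardt's (X)": (5.0, 50.0),
}


def recipe(final_volume_ml=FINAL_VOLUME_ML):
    """Return the volume of each stock, plus water to reach the final volume."""
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in COMPONENTS.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in recipe().items():
        print(f"{name}: {ml:.1f} ml")
```

With different stock strengths only the second number in each `COMPONENTS` pair changes.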
One positive signal was obtained from this screening which, after three rounds
of screening, was in vivo excised and grown for a plasmid prep to use for sequencing.
A BLAST search comparison showed that the protein encoded by this gene had a
similarity of 76% to an alcohol dehydrogenase from Solanum lycopersicum
(Jacobsen, S.E. and Olszewski, N.E., Planta, 198:78 (1996)). However, the clone
was truncated at the N-terminal end by approximately 60 amino acid residues, as was
indicated by comparison to the original N-terminal sequence analysis of
(-)-secoisolariciresinol dehydrogenase. Additional screenings of the Forsythia cDNA
library using similar hybridization conditions were performed with probes obtained
from restriction enzyme digested fragments of the truncated clone. These probes
yielded only one additional clone, which had the same sequence and the same
truncation as the original clone.

An alternative scheme was used to obtain the complete clone from the original
cDNA library stock. A primer, DEHY19REV (SEQ ID NO:17), was made from the
3' end of the truncated clone and used with the T3 primer in a PCR with the original
Forsythia cDNA library purified phage DNA as template, but failed to yield cDNA
clones having the complete N-terminus. Consequently, another primer,
DEHYF30REVB (SEQ ID NO:18), was synthesized from the 3' end of the truncated
clone and used with the T3 primer (SEQ ID NO:19) in a PCR with the original
Forsythia cDNA library purified phage DNA as template. This PCR product, when
cloned into TA vector, resulted in a clone having the complete N-terminus (SEQ ID
NO:11) which was obtained from the initial amino acid sequencing of the blotted
protein. A new primer, DEHYNTERM1 (SEQ ID NO:20), made from the
N-terminal DNA sequence of this clone was used with the T7 primer (SEQ ID
NO:21) and again with the original purified Forsythia cDNA library as template. The
resulting PCR band of 1 kb was purified on an agarose gel, eluted by using a
Microcon 30 (AMICON) and cloned directly into a TA vector (Invitrogen). This
provided a clone (DEHY130) (SEQ ID NO:22) which had the DNA sequence
containing the complete N-terminal amino acid sequence present in the original
protein (SEQ ID NO:11). The amino acid sequence (SEQ ID NO:23) encoded by
DEHY130 (SEQ ID NO:22) was lacking a start methionine, but comparison with
database sequences showing similarity to this protein indicated that, at most,
only 2 to 3 amino acid residues, if any, were lacking in addition to a
start methionine. Based on this information, a new 5' primer, designated
DEHY130NTERM (SEQ ID NO:24), was synthesized to include a start methionine
at the beginning of the sequence. Also, the 5' primer (SEQ ID NO:24) and a 3'
primer, designated DEHY130CTERM (SEQ ID NO:25), were designed to
incorporate Nde I restriction enzyme sites at both ends of the clone for future
insertion into the SBET expression vector (14) for production of the protein in
E. coli. These new primers (SEQ ID NO:24 and SEQ ID NO:25) were used for PCR
with 2 ng of plasmid DNA of the previously obtained DEHY130 clone (SEQ ID
NO:22) as template. The resulting PCR product of approximately 859 bp (SEQ ID
NO:1), designated DEHY133, was cloned directly into a TA vector (Invitrogen). The
DNA sequence indicated that the DEHY133 dehydrogenase clone now contained a
Met start codon.
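
The Nde I recognition sequence, CATATG, contains an ATG, so a 5' primer carrying the site can supply the start methionine at the same time. The sketch below shows how a construct might be checked for flanking Nde I sites and an in-site start codon. The sequence `HYPOTHETICAL_CONSTRUCT` is made up for illustration (its flanking bases and stop codon are assumptions, not taken from SEQ ID NO:24 or SEQ ID NO:25), and the helper names are hypothetical.

```python
NDE_I_SITE = "CATATG"  # Nde I recognition sequence; its last three bases are ATG


def find_sites(seq, site=NDE_I_SITE):
    """Return the 0-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def orf_from_first_site(seq):
    """Return the sequence starting at the ATG embedded in the first Nde I site."""
    sites = find_sites(seq)
    if not sites:
        raise ValueError("no Nde I site found")
    return seq.upper()[sites[0] + 3:]


# Hypothetical construct: 2 flanking bases, Nde I site, four codons, stop,
# second Nde I site, 2 flanking bases.
HYPOTHETICAL_CONSTRUCT = "gg" + "catatg" + "cagcttcgaact" + "taa" + "catatg" + "cc"

if __name__ == "__main__":
    print(find_sites(HYPOTHETICAL_CONSTRUCT))
    print(orf_from_first_site(HYPOTHETICAL_CONSTRUCT)[:3])
```

A check of this kind also flags internal CATATG sequences, which would cause the insert to be cut into pieces during excision.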

In addition, the Nde I fragment from the engineered DEHY133 clone (SEQ ID
NO:1) in the TA vector was used as a probe to re-screen 300,000 pfu from the
original F. intermedia cDNA library. This resulted in numerous strong signals, of
which 11 were isolated and screened further. All of the isolated clones provided
sequences either similar to, or identical to, the original DEHY133 clone (SEQ ID
NO:1). A few of these had additional residues at the N-terminus and contained a start
Met, which confirmed that only a few of the N-terminal residues were lacking from
the original DEHY130 clone (SEQ ID NO:22). The nucleic acid sequences of four of
these clones are set forth in: SEQ ID NO:3 (designated SMDEHY321), SEQ ID
NO:5 (designated SMDEHY431), SEQ ID NO:7 (designated SMDEHY511) and SEQ
ID NO:9 (designated SMDEHY631). Some of these clones, such as SMDEHY133
(SEQ ID NO:1) and SMDEHY631 (SEQ ID NO:9), produced proteins that catalyzed
the stereochemical conversion of (-)-secoisolariciresinol into (-)-matairesinol, as set
forth below.

Expression in E. coli of (-)-Secoisolariciresinol Dehydrogenase. Since the
engineered DEHY133 (SEQ ID NO:1) construct was also in correct reading frame
with the lacZ in the original TA cloning vector (Invitrogen), an initial screening for
dehydrogenase activity was conducted using the product from an E. coli culture
harboring this plasmid. The dehydrogenase coding region was also excised using the
Nde I sites at the 5' and 3' ends and cloned into the SBET vector. This construct was
then transformed into B834(DE3), an E. coli strain for overexpression of the cloned
dehydrogenase protein.

The E. coli culture containing the dehydrogenase clone was grown at 37°C in
25 ml of SOC Kn50 medium to an O.D. of 0.5. To this was added IPTG to give a
final concentration of 0.5 mM and the culture was grown at 18°C for an additional
20 h. The cells were pelleted (600 x g, 4°C, 12 min), resuspended in 5 ml of 20 mM
Tris-HCl pH 8.0, 5 mM DTT buffer and repelleted. The final bacterial pellet was
resuspended in 200 µL of the above buffer and sonicated 4 x 15 sec using a
Braun-Sonic 2000 sonicator set at maximal output of -0.64. The sample was then
centrifuged (20,800 x g, 4°C, 15 min) and the crude supernate was assayed for
dehydrogenase activity. This protein catalyzed the conversion of labelled
(-)-secoisolariciresinol substrate into an intermediary (-)-lactol and further conversion
to (-)-matairesinol (see Figures 1 and 2). The (+)-antipode of secoisolariciresinol did
not serve as a substrate. The clone SMDEHY631 (SEQ ID NO:9) was also
expressed in E. coli in the foregoing manner.
Example 3

Hybridization of Secoisolariciresinol Dehydrogenase
cDNA SMDEHY631 (SEQ ID NO:9) to Messenger RNA Molecules
Encoding Secoisolariciresinol Dehydrogenase

The following procedure was utilized to detect mRNA molecules that encode
secoisolariciresinol dehydrogenase in other plant species. Total RNA was isolated
from the following plant species: Forsythia intermedia (control); Podophyllum
peltatum (a species that synthesizes the lignan podophyllotoxin); Linum flavum (a
species that synthesizes the lignan podophyllotoxin) and Thuja plicata (a species that
synthesizes the lignan plicatic acid). Total RNA was isolated from young leaf tissue
by the lithium chloride precipitation method (Dong, J.-Z. & Dunstan, D. I. Plant Cell
Reports 15:516-521 (1996)). Radiolabeled probe (SMDEHY631 (SEQ ID NO:9))
was prepared using the Pharmacia T7 Quickprime Kit #27-9252-01. The EcoRI/XhoI
fragment containing the secoisolariciresinol dehydrogenase clone was separated in low
melting point (LMP) agarose (GIBCO/BRL Ultrapure LMP Agarose), with the
agarose liquefied using AgarAce enzyme (Promega). The isolated probe DNA was
boiled for 10 minutes, then cooled quickly and briefly on ice and held at 37°C for 10
minutes. The reaction buffer mixture, enzyme and radioisotope α-32P-dCTP were
added and the fragment was incubated for 20 minutes at 37°C. The labeled DNA
fragment was then separated from unincorporated free radionucleotides by passing
through a Centri-Spin 20 column (Princeton Separation).

Total RNA from the foregoing plant species was separated on a 1.3%
agarose/formaldehyde gel and blotted onto Amersham Hybond nylon membrane in
10X SSC for 18 hr. The blotted membrane was prehybridized for 5 hr at 42°C in a
prehybridization solution having the following composition: 5X SSPE; 150 µg/ml
sheared salmon sperm DNA; 2X Denhardt's solution; 1% SDS; 0.05X BLOTTO and
50% formamide. 0.2 ml of prehybridization solution was used per square centimeter of
membrane. A 50X stock solution of Denhardt's solution contains 5 g Ficoll (Type
400, Pharmacia), 5 g of polyvinylpyrrolidone, 5 g of bovine serum albumin (Fraction
V, Sigma) and water to 500 ml. Hybridization was conducted for 16 hr at 42°C in the
same solution that was used for prehybridization. After hybridization was complete,
the blot was washed in the following manner: three times in 2X SSPE at 30°C for 5
min per wash; then once in 2X SSPE/0.5% SDS at 30°C for 10 min. A single
hybridizing mRNA band of approximately 1 kb was visible in each of the blotted RNA
samples.
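
Two quantities in this procedure follow directly from the figures given: the working concentration of each Denhardt's component, and the volume of prehybridization solution needed for a given blot. The sketch below works both out. The 10 cm x 15 cm membrane is a hypothetical size chosen for illustration, and the function names are assumptions.

```python
DENHARDT_STOCK_FOLD = 50      # the stock described in the text is 50X
GRAMS_PER_COMPONENT = 5.0     # Ficoll, polyvinylpyrrolidone and BSA: 5 g each
STOCK_VOLUME_ML = 500.0       # made up with water to 500 ml


def component_mg_per_ml(working_fold):
    """Concentration of each Denhardt's component at the given working strength."""
    stock_mg_per_ml = GRAMS_PER_COMPONENT * 1000.0 / STOCK_VOLUME_ML  # 10 mg/ml at 50X
    return stock_mg_per_ml * working_fold / DENHARDT_STOCK_FOLD


def prehyb_volume_ml(width_cm, height_cm, ml_per_cm2=0.2):
    """Solution volume for a rectangular membrane at 0.2 ml per square centimeter."""
    return width_cm * height_cm * ml_per_cm2


if __name__ == "__main__":
    # 2X Denhardt's, as used in the prehybridization solution
    print(f"{component_mg_per_ml(2):.2f} mg/ml of each component")
    # Hypothetical 10 cm x 15 cm membrane
    print(f"{prehyb_volume_ml(10, 15):.1f} ml of solution")
```

At 2X, each of the three components is therefore present at 0.4 mg/ml, or 0.04% (w/v).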

Example 4 

Hybridization Under Stringent Hybridization Conditions 

In one aspect, the present invention provides isolated nucleic acid molecules
that hybridize under stringent hybridization conditions to a fragment (having a length
of at least 15 bases) of any one of the nucleic acid molecules set forth in SEQ ID
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9.
Hybridization under stringent hybridization conditions is achieved as follows. For
high stringency hybridization, nitrocellulose membranes (or other membranes suitable
for blotting nucleic acid molecules) are hybridized in 6 X SSC, 5 X Denhardt's, 0.5%
SDS at 55°C for at least one hour. The hybridized filters are then washed in 2 X SSC,
0.5% SDS at 55°C for at least fifteen minutes. For moderate stringency hybridization,
nitrocellulose membranes (or other membranes suitable for blotting nucleic acid
molecules) are hybridized in 6 X SSC, 5 X Denhardt's, 0.5% SDS at 42°C for at least
one hour. The hybridized filters are then washed in 4 X SSC (or 6 X SSC), 0.5%
SDS at 30°C to 35°C for at least fifteen minutes. High stringency hybridization
conditions are preferably used for hybridization to a nucleic acid molecule from a
Forsythia species. Moderate stringency hybridization conditions are preferably used
for hybridization to a nucleic acid molecule from a species not included in the genus
Forsythia.
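
The two sets of conditions and the stated preference between them can be written down as a small lookup. The sketch below only encodes what this example says. The class and function names are assumptions, and the genus-based choice reflects the stated preference, not a requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    name: str
    hyb_buffer: str
    hyb_temp_c: int
    wash_buffer: str
    wash_temp_c: tuple  # (minimum, maximum) in degrees C


HYB_BUFFER = "6 X SSC, 5 X Denhardt's, 0.5% SDS"

HIGH = Stringency("high", HYB_BUFFER, 55, "2 X SSC, 0.5% SDS", (55, 55))
MODERATE = Stringency(
    "moderate", HYB_BUFFER, 42, "4 X SSC (or 6 X SSC), 0.5% SDS", (30, 35)
)


def preferred_conditions(genus):
    """High stringency for Forsythia targets, moderate stringency otherwise."""
    return HIGH if genus.strip().lower() == "forsythia" else MODERATE


if __name__ == "__main__":
    for genus in ("Forsythia", "Linum", "Thuja"):
        c = preferred_conditions(genus)
        print(f"{genus}: {c.name} stringency, hybridize at {c.hyb_temp_c} C")
```

Both conditions share the same hybridization buffer, so only the temperatures and the wash buffer distinguish them.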

Presently preferred nucleic acid molecules useful for hybridizing to isolated
nucleic acid molecules of the present invention include the nucleic acid molecules
having the sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ
ID NO:7 and SEQ ID NO:9. Hybridization in accordance with the present example
can be achieved by any art-recognized hybridization procedure such as, for example,
by utilizing the technique of hybridizing radiolabeled nucleic acid probes to nucleic
acids immobilized on nitrocellulose filters or nylon membranes as set forth, for
example, at pages 9.52 to 9.55 of Molecular Cloning, A Laboratory Manual (2nd
edition), J. Sambrook, E.F. Fritsch and T. Maniatis eds., the cited pages of which are
incorporated herein by reference.

The foregoing stringent hybridization conditions can be used to identify
nucleic acid molecules encoding secoisolariciresinol dehydrogenase protein from a
wide range of plant genera including, but not limited to, Podocarpus, Tsuga, Pinus,
Thuja, Araucaria, Juniperus, Taiwania, Virola, Piper, Arctium, Podophyllum and
Linum.

While the preferred embodiment of the invention has been illustrated and 
described, it will be appreciated that various changes can be made therein without 
departing from the spirit and scope of the invention.

The embodiments of the invention in which an exclusive property or privilege
is claimed are defined as follows: 

1. An isolated nucleic acid molecule encoding a secoisolariciresinol 
dehydrogenase protein. 

2. A nucleic acid molecule of Claim 1 encoding a gymnosperm 
secoisolariciresinol dehydrogenase protein. 

3. A nucleic acid molecule of Claim 2 encoding a secoisolariciresinol
dehydrogenase protein from a genus selected from the group consisting of 
Podocarpus, Tsuga, Pinus, Thuja, Araucaria, Juniperus and Taiwania. 

4. A nucleic acid molecule of Claim 1 encoding an angiosperm 
secoisolariciresinol dehydrogenase protein. 

5. A nucleic acid molecule of Claim 4 encoding a secoisolariciresinol 
dehydrogenase protein from a genus selected from the group consisting of Virola, 
Piper, Arctium, Podophyllum and Linum. 

6. A nucleic acid molecule of Claim 1 encoding a secoisolariciresinol 
dehydrogenase protein from a Forsythia species. 

7. A nucleic acid molecule of Claim 6 encoding a secoisolariciresinol 
dehydrogenase protein from Forsythia intermedia. 

8. A nucleic acid molecule of Claim 7 encoding a secoisolariciresinol 
dehydrogenase protein having the amino acid sequence of any one of SEQ ID NO:2, 
SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8 and SEQ ID NO:10.

9. A nucleic acid molecule of Claim 7 having the nucleic acid sequence of 
any one of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID
NO:9.

10. An isolated nucleic acid molecule that hybridizes under stringent 
conditions to a fragment of any one of the nucleic acid molecules set forth in SEQ ID 
NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7 and SEQ ID NO:9, said
fragment having a length of at least 15 bases.

11. An isolated recombinant secoisolariciresinol dehydrogenase protein.

12. An isolated recombinant gymnosperm secoisolariciresinol 
dehydrogenase protein of Claim 11.

13. An isolated recombinant gymnosperm secoisolariciresinol 
dehydrogenase protein of Claim 12, said protein occurring naturally in a gymnosperm 
genus selected from the group consisting of Podocarpus, Tsuga, Pinus, Thuja, 
Araucaria, Juniperus and Taiwania. 

14. An isolated recombinant angiosperm secoisolariciresinol 
dehydrogenase protein of Claim 11.

15. An isolated recombinant angiosperm secoisolariciresinol 
dehydrogenase protein of Claim 14, said protein occurring naturally in an angiosperm 
genus selected from the group consisting of Virola, Piper, Arctium, Podophyllum and 
Linum. 

16. An isolated recombinant Forsythia secoisolariciresinol dehydrogenase 
protein of Claim 11.

17. An isolated recombinant Forsythia secoisolariciresinol dehydrogenase 
protein of Claim 11, said protein having the amino acid sequence of any one of SEQ 
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8 and SEQ ID NO:10.

18. A replicable expression vector comprising a nucleic acid sequence 
encoding secoisolariciresinol dehydrogenase. 

19. A replicable expression vector of Claim 18 comprising a nucleic acid 
sequence encoding secoisolariciresinol dehydrogenase from a genus selected from the 
group consisting of Podocarpus, Tsuga, Pinus, Thuja, Araucaria, Juniperus, 
Taiwania, Virola, Piper, Arctium, Podophyllum and Linum. 

20. A replicable expression vector of Claim 18 comprising a nucleic acid 
sequence encoding secoisolariciresinol dehydrogenase having the biological activity of 
a protein having the amino acid sequence of any one of SEQ ID NO:2, SEQ ID NO:4, 
SEQ ID NO:6, SEQ ID NO:8 and SEQ ID NO:10.

21. A host cell comprising a vector of any one of Claim 18, Claim 19 or
Claim 20. 

22. A method of enhancing the expression of secoisolariciresinol 
dehydrogenase protein in a suitable host cell comprising introducing into the host cell 
an expression vector that comprises a nucleotide sequence encoding a protein having 
the biological activity of a secoisolariciresinol dehydrogenase protein having the 
amino acid sequence set forth in any one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID 
NO:6, SEQ ID NO:8 and SEQ ID NO:10.

23. A method of modifying the expression of secoisolariciresinol 
dehydrogenase protein in a suitable host cell comprising introducing into the host cell 
an expression vector that comprises a nucleotide sequence that expresses an RNA that

information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which

No. 25^096^. Jerald E. Nagae, Reg. No. 29,4 1J ; Dennis K. Shelton, Reg. No. 26,997;

these statements were made with the knowledge that willful false statements and the like so
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the 
United States Code, and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon.

SEQUENCE LISTING

<110> Xia, Zhi-Qiang
      Costa, Michael A
      Davin, Laurence B
      Lewis, Norman G

<120> Recombinant Secoisolariciresinol Dehydrogenase, and
      Methods of Use

<130> wsurll3787

<140> Not yet assigned
<141> 1999-04-23

<150> 60/082, 9 7 ;
<151> 1998-04-24

<160> 2:.-

<170> PatentIn Ver. 2.0

<210> 1
<211> 819
<212> DNA
<213> Forsythia x intermedia

<220>
<221> CDS
<222> (1)..(819)

<400> 1

atg cag ctt cga act gca ttc gca aga agg cta gaa gga [...]
Met Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly Lys Val Ala
1 5 10 15

ctt ata aca gga gga [...] agt gga att gga gaa acc aca gca aaa ctc
Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr Thr Ala Lys Leu
20 25 30

ttc tcc caa cat gga [...] gtt gcc att gct gat gtc caa gat ga[.]
Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu
35 40 45

tta ggt cac tca gtt gtc gag gcc att ggc act tcc aat tcc acc tac
Leu Gly His Ser Val Val Glu Ala Ile Gly Thr Ser Asn Ser Thr Tyr
50 55 60

atc cac tgt gat gtt act aat gaa gac ggt gtt aaa aat gcc gtg gac
Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp
65 70 75 80

aac aca gtt tca acc tat gga aaa ctg gac att atg ttc agc aat gca
Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Ser Asn Ala
85 90 95

gga att tcc gat ccc aac agg ccc cgc atc ata gac aac gaa aaa gca
Gly Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp Asn Glu Lys Ala
100 105 110

gac ttt gaa cgc gtt ctc agt gta aat gta acc gga gtt ttc cta tgc
Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val Phe Leu Cys
115 120 125

atg aag cac gca gca cgt gtt atg att cca gca cgc agt ggc aac ata
Met Lys His Ala Ala Arg Val Met Ile Pro Ala Arg Ser Gly Asn Ile
130 135 140

att tcc act gct agt tta agc tca act atg ggt ggt ggt tct tca cat
Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His
145 150 155 160

gcc tat tgt ggt tca aag cat gct gtg tta gcc ctt act agg aat ctg
Ala Tyr Cys Gly Ser Lys His Ala Val Leu Ala Leu Thr Arg Asn Leu
165 170 175

gca gtc gag ctc gga caa ttt ggc att agg gtt aat tgt ttg tct cct
Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro
180 185 190

ttc ggg ctt cct acg gct tta ggc aag aaa ttt tca ggg att aaa aat
Phe Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser Gly Ile Lys Asn
195 200 205

gaa gaa gaa ttt gag aat gta ata aac ttt gcg gga aat ttg aaa ggt
Glu Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly Asn Leu Lys Gly
210 215 220

cca aaa ttt aat gtt gag gat gtt gcc aat gca gct ctt tat ctg gct
Pro Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala Leu Tyr Leu Ala
225 230 235 240

agt gat gag gca aaa tac gtg agt gga cac aat ctg ttc att gat gga
Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu Phe Ile Asp Gly
245 250 255

ggg ttc agc gtc tgc aat tct gta atc aaa gtg ttc caa tat cca gat
Gly Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe Gln Tyr Pro Asp
260 265 270

tct
Ser
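
A CDS entry of this kind can be cross-checked against its paired protein entry by translating it with the standard genetic code. The sketch below is illustrative and is not part of the filed listing. It translates the first five codons of SEQ ID NO:1 as they read after correcting extraction errors (an assumption about the intended bases), and checks that the stated length of 819 bases corresponds to 273 codons.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ('*' marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First five codons of SEQ ID NO:1, as read after OCR correction (assumption).
FIRST_CODONS = "atg cag ctt cga act"

if __name__ == "__main__":
    print(translate(FIRST_CODONS))   # Met Gln Leu Arg Thr in one-letter code
    print(819 // 3, 819 % 3)         # codon count and remainder for the stated length
```

Any codon that fails the lookup, for example one containing a letter other than a, c, g or t, points to a transcription error in the DNA line.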



<210> 2 
<211> 273 
<212> PRT 

<213> Forsythia x intermedia 
<400> 2 

Met Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly Lys Val Ala
1 5 10 15

Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr Thr Ala Lys Leu
20 25 30

Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu
35 40 45

Leu Gly His Ser Val Val Glu Ala Ile Gly Thr Ser Asn Ser Thr Tyr
50 55 60

Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp
65 70 75 80

Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Ser Asn Ala
85 90 95

Gly Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp Asn Glu Lys Ala
100 105 110

Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val Phe Leu Cys
115 120 125

Met Lys His Ala Ala Arg Val Met Ile Pro Ala Arg Ser Gly Asn Ile
130 135 140

Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His
145 150 155 160

Ala Tyr Cys Gly Ser Lys His Ala Val Leu Ala Leu Thr Arg Asn Leu
165 170 175

Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro
180 185 190

Phe Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser Gly Ile Lys Asn
195 200 205

Glu Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly Asn Leu Lys Gly
210 215 220

Pro Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala Leu Tyr Leu Ala
225 230 235 240

Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu Phe Ile Asp Gly
245 250 255

Gly Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe Gln Tyr Pro Asp
260 265 270

<210> 3
<211> 831
<212> DNA
<213> Forsythia x intermedia

<220>
<221> CDS
<222> (1)..(831)

<400> 3

atg gca gcc act tca cag gtt cta act gca atc gca aga agg cta gaa
Met Ala Ala Thr Ser Gln Val Leu Thr Ala Ile Ala Arg Arg Leu Glu
1 5 10 15

gga aaa gtt gcc ctt ata aca gga gga gcc agt gga att gga gaa acc
Gly Lys Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr
20 25 30

aca gca aaa ctc ttc tcc caa cat gga gcc aaa gtt gcc att gct gat
Thr Ala Lys Leu Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp
35 40 45

gtc caa gat gaa tta ggt cac tca gtt gtc gag gcc att ggc act tcc
Val Gln Asp Glu Leu Gly His Ser Val Val Glu Ala Ile Gly Thr Ser
50 55 60

aat tcc acc tac atc cac tgt gat gtt act aat gaa gac ggt gtt aaa
Asn Ser Thr Tyr Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys
65 70 75 80

aat gcc gtg gac aac aca gtt tca acc tat gga aaa ctg gac att atg
Asn Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met
85 90 95

ttc agc aat gca gga att tct gat ccc aac agg ccc cgc atc ata gac
Phe Ser Asn Ala Gly Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp
100 105 110

aac gaa aaa gca gac ttt gaa cgc gtt ttc agt gta aat gta acc gga
Asn Glu Lys Ala Asp Phe Glu Arg Val Phe Ser Val Asn Val Thr Gly
115 120 125

gtt ttc cta tgc atg aag cac gca gca cgt gtt atg att cca gca cgc
Val Phe Leu Cys Met Lys His Ala Ala Arg Val Met Ile Pro Ala Arg
130 135 140

agt ggc aac ata att tcc act gct agt tta agc tca act atg ggt ggt
Ser Gly Asn Ile Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly
145 150 155 160

ggt tct tca cat gcc tat tgt ggt tca aag cat gct gtg tta ggc ctt
Gly Ser Ser His Ala Tyr Cys Gly Ser Lys His Ala Val Leu Gly Leu
165 170 175

act agg aat ctg gca gtc gag ctc gga caa ttt ggc att agg gtt aat
Thr Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn
180 185 190

tgt ttg tct cct ttc ggg ctt cct acg gct tta ggc aag aaa ttt tca
Cys Leu Ser Pro Phe Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser
195 200 205

ggg att aaa aat gaa gaa gaa ttt gag aat gta ata aac ttt gcg gga
Gly Ile Lys Asn Glu Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly
210 215 220

aat ctg aaa ggt cca aaa ttt aat gtt gag gat gtt gcc aat gca gct
Asn Leu Lys Gly Pro Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala
225 230 235 240

ctt tat ctg gct agt gat gag gca aaa tac gtg agt gga cac aat ctg 768
Leu Tyr Leu Ala Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu
245 250 255

ttc att gat gga ggg ttc agc gtc tgc aat tct gta atc aaa gtg ttc 816
Phe Ile Asp Gly Gly Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe
260 265 270

caa tat cca gat tct 831
Gln Tyr Pro Asp Ser
275



<210> 4 
<211> 277 
<212> PRT 

<213> Forsythia x intermedia 
<400> 4 

Met Ala Ala Thr Ser Gln Val Leu Thr Ala Ile Ala Arg Arg Leu Glu
1 5 10 15

Gly Lys Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr
20 25 30

Thr Ala Lys Leu Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp
35 40 45

Val Gln Asp Glu Leu Gly His Ser Val Val Glu Ala Ile Gly Thr Ser
50 55 60

Asn Ser Thr Tyr Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys
65 70 75 80

Asn Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met
85 90 95

Phe Ser Asn Ala Gly Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp
100 105 110

Asn Glu Lys Ala Asp Phe Glu Arg Val Phe Ser Val Asn Val Thr Gly
115 120 125

Val Phe Leu Cys Met Lys His Ala Ala Arg Val Met Ile Pro Ala Arg
130 135 140

Ser Gly Asn Ile Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly
145 150 155 160

Gly Ser Ser His Ala Tyr Cys Gly Ser Lys His Ala Val Leu Gly Leu
165 170 175

Thr Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn
180 185 190

Cys Leu Ser Pro Phe Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser
195 200 205

Gly Ile Lys Asn Glu Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly
210 215 220

Asn Leu Lys Gly Pro Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala
225 230 235 240

Leu Tyr Leu Ala Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu
245 250 255

Phe Ile Asp Gly Gly Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe
260 265 270

Gln Tyr Pro Asp Ser
275

<210> 5
<211> 819
<212> DNA
<213> Forsythia x intermedia

<220>
<221> CDS
<222> (1)..(819)

<400> 5

atg cag ctt cga act gca atc gca aga agg cta gaa gga aaa gtt gcc
Met Gln Leu Arg Thr Ala Ile Ala Arg Arg Leu Glu Gly Lys Val Ala
1 5 10 15

ctt ata aca gga gga gcc agt gga gtt gga gaa gtc aca gca aaa ctc
Leu Ile Thr Gly Gly Ala Ser Gly Val Gly Glu Val Thr Ala Lys Leu
20 25 30

ttc tcc caa cat gga gcc aaa gtt gcc att gct gat gtc caa gat gaa
Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu
35 40 45

tta ggt cac tca gtt gtc gag gcc att ggc cct tcc aat tcc acc tac
Leu Gly His Ser Val Val Glu Ala Ile Gly Pro Ser Asn Ser Thr Tyr
50 55 60

atc cac tgc gat gtt act aat gaa gac ggt gtt aaa aat gcc gtg gac
Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp
65 70 75 80

aac aca gtt tca acc tat gga aaa ctg gac att atg ttc aac aat gca
Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Asn Asn Ala
85 90 95

gga att tct gat ccc tac aag ccc cgg gtc ata gac aac gaa aaa gca
Gly Ile Ser Asp Pro Tyr Lys Pro Arg Val Ile Asp Asn Glu Lys Ala
100 105 110

gac ttt gaa cgc gtt ctc agt gtn aat gtn acc gga gtt ttc cta ttt
Asp Phe Glu Arg Val Leu Ser Xaa Asn Xaa Thr Gly Val Phe Leu Phe
115 120 125

atg aag cac gca gca cgc att atg gtt cca gca cga aat ggc tgc ata
Met Lys His Ala Ala Arg Ile Met Val Pro Ala Arg Asn Gly Cys Ile
130 135 140

att tcc act gct agt tta agc tca act atg ggt ggt ggt tct tca cat
Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His
145 150 155 160

gct tat tgt ggt gca aaa cat gct gta tta ggc ctt act agg aat ctg
Ala Tyr Cys Gly Ala Lys His Ala Val Leu Gly Leu Thr Arg Asn Leu
165 170 175

gca gtc gag ctc gga caa ttt ggc att agg gtt aat tgt ttg tct cct
Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro
180 185 190

ttc ggg ctt cct acg cct cta gcc aag aaa ttt tca ggg att gaa aat
Phe Gly Leu Pro Thr Pro Leu Ala Lys Lys Phe Ser Gly Ile Glu Asn
195 200 205

gat gta gac ttt gcg aat gca ata gaa cat gcg gga aat ctg aaa ggt
Asp Val Asp Phe Ala Asn Ala Ile Glu His Ala Gly Asn Leu Lys Gly
210 215 220

aca aaa ttg agg att gag gat gtt gcc aat gca gct ctt ttt ctg gct
Thr Lys Leu Arg Ile Glu Asp Val Ala Asn Ala Ala Leu Phe Leu Ala
225 230 235 240

agt gat gag gca caa tat gtg agt gga caa aat ctg ttc atc gat gga
Ser Asp Glu Ala Gln Tyr Val Ser Gly Gln Asn Leu Phe Ile Asp Gly
245 250 255

ggg ttc agc gtc tgc aat tct gca atc aaa atg ttc caa tat cca gac
Gly Phe Ser Val Cys Asn Ser Ala Ile Lys Met Phe Gln Tyr Pro Asp
260 265 270



<210> 6 
<211> 273 
<212> PRT 

<213> Forsythia x intermedia 
<400> 6 

Met Gln Leu Arg Thr Ala Ile Ala Arg Arg Leu Glu Gly Lys Val Ala
1 5 10 15

Leu Ile Thr Gly Gly Ala Ser Gly Val Gly Glu Val Thr Ala Lys Leu
20 25 30

Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu
35 40 45

Leu Gly His Ser Val Val Glu Ala Ile Gly Pro Ser Asn Ser Thr Tyr
50 55 60

Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp
65 70 75 80

Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Asn Asn Ala
85 90 95

Gly Ile Ser Asp Pro Tyr Lys Pro Arg Val Ile Asp Asn Glu Lys Ala
100 105 110

Asp Phe Glu Arg Val Leu Ser Xaa Asn Xaa Thr Gly Val Phe Leu Phe
115 120 125

Met Lys His Ala Ala Arg Ile Met Val Pro Ala Arg Asn Gly Cys Ile
130 135 140

Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His
145 150 155 160

Ala Tyr Cys Gly Ala Lys His Ala Val Leu Gly Leu Thr Arg Asn Leu
165 170 175

Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro
180 185 190

Phe Gly Leu Pro Thr Pro Leu Ala Lys Lys Phe Ser Gly Ile Glu Asn
195 200 205

Asp Val Asp Phe Ala Asn Ala Ile Glu His Ala Gly Asn Leu Lys Gly
210 215 220

Thr Lys Leu Arg Ile Glu Asp Val Ala Asn Ala Ala Leu Phe Leu Ala
225 230 235 240

Ser Asp Glu Ala Gln Tyr Val Ser Gly Gln Asn Leu Phe Ile Asp Gly
245 250 255

Gly Phe Ser Val Cys Asn Ser Ala Ile Lys Met Phe Gln Tyr Pro Asp
260 265 270

<210> 7
<211> 831
<212> DNA
<213> Forsythia x intermedia

<220>
<221> CDS
<222> (1)..(831)

<400> 7

atg gcc agt act tea cag gtt eta act gca ate aca aga agg eta gaa 48 
Met Ala Ser Thr Ser Gin Val Leu Thr Ala He Thr Arg Arg Leu Glu 
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gga aaa gtt gcc ctt ata aca gga gga gcc agt gga att gga gaa ttc 
Glv Lvs Val Ala Leu He Thr Gly Gly Ala Ser Gly He Gly Glu Phe 
20 25 30 

aca gca aaa etc ttc tec caa cat gga gcc aaa gtt gcc att get gat 
Thr Ala Lys Leu Phe Ser Gin His Gly Ala Lys Val Ala He Ala Asp 
35 40 45 

gtc caa gat gaa tta ggt cac tea gtt gtc gag gcc ate ggc act tec 
Val Gin Asp Glu Leu Gly His Ser Val Val Glu Ala He Gly Thr Ser 
50 55 60 

aat tec ate tac ate cac tgc gat gtt acc aat gaa gac gat gtt aaa 
Asn Ser He Tyr He His Cys Asp Val Thr Asn Glu Asp Asp Val Lys 
65 70 75 80 

aat gcc gtg gac aac aca gtt tea acc tat gga aaa ctg gac att atg 
Asn Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp He Met 
85 90 95 

ttc aac aat gca gga att get gac ccc aac aag ccc cgc ate gta gac 
Phe Asn Asn Ala Gly He Ala Asp Pro Asn Lys Pro Arg He Val Asp 
100 105 HO 

aac gaa aaa gca gac ttt gaa cgc gtt ctc agc gta aat gta acc ggt
Asn Glu Lys Ala Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly
115 120 125

gtt ttc cta tgc atg aag cac gca gca cgc gtt atg gtg cca gca cgc
Val Phe Leu Cys Met Lys His Ala Ala Arg Val Met Val Pro Ala Arg
130 135 140

agt ggc agc ata att tcc act gct agc gta agc tca aca att ggt ggt
Ser Gly Ser Ile Ile Ser Thr Ala Ser Val Ser Ser Thr Ile Gly Gly
145 150 155 160

gct gct tca cat gct tat tgt tgt tca aag cat gct gtg tta ggc ctt
Ala Ala Ser His Ala Tyr Cys Cys Ser Lys His Ala Val Leu Gly Leu
165 170 175

act agg aat ctg gca gtc gag ctc gga caa ttt ggc att agg gtt aat
Thr Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn
180 185 190 

tgt ttg gct cct tac gcg ctt gct acg cct tta gcc aag aaa ttt gta
Cys Leu Ala Pro Tyr Ala Leu Ala Thr Pro Leu Ala Lys Lys Phe Val
195 200 205

ggg ctt gaa aat gac gaa gat ttg gag aat gca atg agc ctt atg gga
Gly Leu Glu Asn Asp Glu Asp Leu Glu Asn Ala Met Ser Leu Met Gly
210 215 220

aat ctg aaa ggt aca aat ttg aag gct gag gac gtc gcc aat gca gct
Asn Leu Lys Gly Thr Asn Leu Lys Ala Glu Asp Val Ala Asn Ala Ala
225 230 235 240

ctt tat ctg gca agt gat gag gca aaa tat gtg agt gga cac aat ctg 768
Leu Tyr Leu Ala Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu
245 250 255

ttc att gat gga ggg ttc agc gtc tac aat tct gca atc aaa atg ttc 816
Phe Ile Asp Gly Gly Phe Ser Val Tyr Asn Ser Ala Ile Lys Met Phe
260 265 270 

caa tat cca gac act 831 
Gln Tyr Pro Asp Thr 
275 

<210> 8 
<211> 277 
<212> PRT 

<213> Forsythia x intermedia 
<400> 8 

Met Ala Ser Thr Ser Gln Val Leu Thr Ala Ile Thr Arg Arg Leu Glu
1 5 10 15

Gly Lys Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Phe
20 25 30

Thr Ala Lys Leu Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp
35 40 45

Val Gln Asp Glu Leu Gly His Ser Val Val Glu Ala Ile Gly Thr Ser
50 55 60

Asn Ser Ile Tyr Ile His Cys Asp Val Thr Asn Glu Asp Asp Val Lys
65 70 75 80

Asn Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met
85 90 95

Phe Asn Asn Ala Gly Ile Ala Asp Pro Asn Lys Pro Arg Ile Val Asp
100 105 110

Asn Glu Lys Ala Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly
115 120 125

Val Phe Leu Cys Met Lys His Ala Ala Arg Val Met Val Pro Ala Arg
130 135 140

Ser Gly Ser Ile Ile Ser Thr Ala Ser Val Ser Ser Thr Ile Gly Gly
145 150 155 160

Ala Ala Ser His Ala Tyr Cys Cys Ser Lys His Ala Val Leu Gly Leu
165 170 175

Thr Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn
180 185 190

Cys Leu Ala Pro Tyr Ala Leu Ala Thr Pro Leu Ala Lys Lys Phe Val
195 200 205 



WO 99/55846 



-11- 



PCT/US99/08975 



Gly Leu Glu Asn Asp Glu Asp Leu Glu Asn Ala Met Ser Leu Met Gly
210 215 220

Asn Leu Lys Gly Thr Asn Leu Lys Ala Glu Asp Val Ala Asn Ala Ala
225 230 235 240

Leu Tyr Leu Ala Ser Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu
245 250 255

Phe Ile Asp Gly Gly Phe Ser Val Tyr Asn Ser Ala Ile Lys Met Phe
260 265 270

Gln Tyr Pro Asp Thr
275



<210> 9 

<211> 828 

<212> DNA 

<213> Forsythia x intermedia 

<220> 

<221> CDS 

<222> (1)..(828)

<400> 9 

atg gcc act tca cag ctt cga act gca ttc gca aga agg cta gaa gga
Met Ala Thr Ser Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly
1 5 10 15

aaa gtt gcc ctt ata aca gga gga gcc agt gga gtt gga gaa gtc aca
Lys Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Val Gly Glu Val Thr
20 25 30

gca aaa ctc ttc tcc caa cat gga gcc aaa gtt gcc att gct gat gtc
Ala Lys Leu Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val
35 40 45

caa gat gaa tta ggt cac tca gtt gtc gag gcc att ggc ctt tcc aat
Gln Asp Glu Leu Gly His Ser Val Val Glu Ala Ile Gly Leu Ser Asn
50 55 60

tcc acc tac atc cac tgc gat gtt act aat gaa gac ggt gtt aaa aat
Ser Thr Tyr Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn
65 70 75 80

gcc gtg gac aac aca gtt tca acc tat gga aaa ctg gac att atg ttc
Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe
85 90 95

aac aat gca gga att tct gat ccc tac aag ccc cgg gtc ata gac aac
Asn Asn Ala Gly Ile Ser Asp Pro Tyr Lys Pro Arg Val Ile Asp Asn
100 105 110



gaa aaa gca gac ttt gaa cgc gtt ctc agt gtt aat gta acc gga gtt
Glu Lys Ala Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val
115 120 125 

ttc cta ttt atg aag cac gca gca cgc att atg gtt cca gca cga agt
Phe Leu Phe Met Lys His Ala Ala Arg Ile Met Val Pro Ala Arg Ser
130 135 140

ggc tgc ata att tcc act gct agt tta agc tca act atg ggt ggt ggt
Gly Cys Ile Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly
145 150 155 160

tct tca cat gct tat tgt ggt tca aag cat gct gta tta ggc ctt act
Ser Ser His Ala Tyr Cys Gly Ser Lys His Ala Val Leu Gly Leu Thr
165 170 175

agg aat ctg gca gtc gag ctc gga caa ttt ggc att agg gtt aat tgt
Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys
180 185 190

ttg tct cct ttc ggg ctt cct acg cct tta gcc aag aaa ttt aca ggg
Leu Ser Pro Phe Gly Leu Pro Thr Pro Leu Ala Lys Lys Phe Thr Gly
195 200 205

att gaa aat gat gaa gac ttg gcg aat gga ata gaa cgt gcg gga aat
Ile Glu Asn Asp Glu Asp Leu Ala Asn Gly Ile Glu Arg Ala Gly Asn
210 215 220

ctg aaa ggt aca aaa ttg agg att gag gat gtt gcc aat gca gct ctt
Leu Lys Gly Thr Lys Leu Arg Ile Glu Asp Val Ala Asn Ala Ala Leu
225 230 235 240

ttt ctg gct agt gat gag gca caa tat gtg agt gga caa aat ctg ttc
Phe Leu Ala Ser Asp Glu Ala Gln Tyr Val Ser Gly Gln Asn Leu Phe
245 250 255

atc gat gga ggg ttc agc gtc tgc aat tct gca atc aaa ttg ttc caa
Ile Asp Gly Gly Phe Ser Val Cys Asn Ser Ala Ile Lys Leu Phe Gln
260 265 270 

tat cca gac tct 
Tyr Pro Asp Ser 
275 



<210> 10 
<211> 276 
<212> PRT 

<213> Forsythia x intermedia 
<400> 10 

Met Ala Thr Ser Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly
1 5 10 15

Lys Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Val Gly Glu Val Thr
20 25 30

Ala Lys Leu Phe Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val
35 40 45

Gln Asp Glu Leu Gly His Ser Val Val Glu Ala Ile Gly Leu Ser Asn
50 55 60

Ser Thr Tyr Ile His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn
65 70 75 80

Ala Val Asp Asn Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe
85 90 95

Asn Asn Ala Gly Ile Ser Asp Pro Tyr Lys Pro Arg Val Ile Asp Asn
100 105 110

Glu Lys Ala Asp Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val
115 120 125

Phe Leu Phe Met Lys His Ala Ala Arg Ile Met Val Pro Ala Arg Ser
130 135 140

Gly Cys Ile Ile Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly
145 150 155 160

Ser Ser His Ala Tyr Cys Gly Ser Lys His Ala Val Leu Gly Leu Thr
165 170 175

Arg Asn Leu Ala Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys
180 185 190

Leu Ser Pro Phe Gly Leu Pro Thr Pro Leu Ala Lys Lys Phe Thr Gly
195 200 205

Ile Glu Asn Asp Glu Asp Leu Ala Asn Gly Ile Glu Arg Ala Gly Asn
210 215 220

Leu Lys Gly Thr Lys Leu Arg Ile Glu Asp Val Ala Asn Ala Ala Leu
225 230 235 240

Phe Leu Ala Ser Asp Glu Ala Gln Tyr Val Ser Gly Gln Asn Leu Phe
245 250 255

Ile Asp Gly Gly Phe Ser Val Cys Asn Ser Ala Ile Lys Leu Phe Gln
260 265 270 

Tyr Pro Asp Ser 
275 



<210> 11 
<211> 21 
<212> PRT 

<213> Forsythia x intermedia 
<220> 

<221> PEPTIDE 
<222> (1)..(21)

<223> N-terminal peptide of F. intermedia
secoisolariciresinol protein wherein Xaa at
positions 3, 12 and 20 represents an unidentified
amino acid residue



<400> 11 



Gln Val Xaa Thr Ala Ile Ala Arg Asp Leu Glu Xaa Lys Val Ala Leu
1 5 10 15

Ile Thr Gly Xaa Ala
20 



<210> 12 
<211> 17 
<212> PRT 

<213> Forsythia x intermedia 

<400> 12

Val Ala Leu Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr Thr
1 5 10 15



<210> 13 

<211> 15 

<212> PRT 

<213> Forsythia 



<400> 13 

Leu Asn Ile Met Phe Ser Asn Ala Gly Ile Ser Asp
1 5 10



<210> 14 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature
<222> (1)..(20)

<223> PCR primer wherein n at positions 3, 9, 15 
represents inosine 

<400> 14 

ggnathggng aracnacngc 



<210> 15 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 
<220>

<221> misc_feature
<222> (1)..(20)

<223> PCR primer wherein n at positions 3 and 9 
represents inosine 

<400> 15 

ccngcrttng araacatdat 20 



<210> 16 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature 
<222> (1)..(20)

<223> PCR primer wherein n at positions 3 and 9 
represents inosine 

<400> 16 

ccngcrttnc traacatdat 20 



<210> 17 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature 
<222> (1)..(20)
<223> PCR primer 

<400> 17 

attccgctag attgcattga 20 



<210> 18 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature
<222> (1)..(20)
<223> PCR primer wherein n at positions 3 and 9
represents inosine

<400> 18

ccngcrttnc traacatdat 20



<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature
<222> (1)..(20)
<223> T7 PCR primer 

<400> 19 

aattaaccct cactaaaggg 20 



<210> 20 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature 
<222> (1)..(23)
<223> PCR primer

<400> 20

cagcttcgaa ctgcattcgc aag 23

<210> 21
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
oligonucleotide

<220>

<221> misc_feature
<222> (1)..(22)
<223> T7 PCR primer

<400> 21

cgggatatca ctcagcataa tg



<210> 22 

<211> 816 

<212> DNA 

<213> Forsythia x intermedia 

<220> 

<221> CDS 

<222> (1)..(816)

<400> 22 

cag ctt cga act gca ttc gca aga agg cta gaa gga aaa gtt gcc ctt
Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly Lys Val Ala Leu
1 5 10 15

ata aca gga gga gcc agt gga att gga gaa acc aca gca aaa ctc ttc
Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr Thr Ala Lys Leu Phe
20 25 30

tcc caa cat gga gcc aaa gtt gcc att gct gat gtc caa gat gaa tta
Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu Leu
35 40 45

ggt cac tca gtt gtc gag gcc att ggc act tcc aat tcc acc tac atc
Gly His Ser Val Val Glu Ala Ile Gly Thr Ser Asn Ser Thr Tyr Ile
50 55 60

cac tgt gat gtt act aat gaa gac ggt gtt aaa aat gcc gtg gac aac
His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp Asn
65 70 75 80

aca gtt tca acc tat gga aaa ctg gac att atg ttc agc aat gca gga
Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Ser Asn Ala Gly
85 90 95

att tct gat ccc aac agg ccc cgc atc ata gac aac gaa aaa gca gac
Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp Asn Glu Lys Ala Asp
100 105 110

ttt gaa cgc gtt ctc agt gta aat gta acc gga gtt ttc cta tgc atg
Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val Phe Leu Cys Met
115 120 125

aag cac gca gca cgt gtt atg att cca gca cgc agt ggc aac ata att
Lys His Ala Ala Arg Val Met Ile Pro Ala Arg Ser Gly Asn Ile Ile
130 135 140

tcc act gct agt tta agc tca act atg ggt ggt ggt tct tca cat gcc
Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His Ala
145 150 155 160

tat tgt ggt tca aag cat gct gtg tta gcc ctt act agg aat ctg gca
Tyr Cys Gly Ser Lys His Ala Val Leu Ala Leu Thr Arg Asn Leu Ala
165 170 175

gtc gag ctc gga caa ttt ggc att agg gtt aat tgt ttg tct cct ttc 576
Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro Phe
180 185 190

ggg ctt cct acg gct tta ggc aag aaa ttt tca ggg att aaa aat gaa 624
Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser Gly Ile Lys Asn Glu
195 200 205

gaa gaa ttt gag aat gta ata aac ttt gcg gga aat ttg aaa ggt cca 672
Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly Asn Leu Lys Gly Pro
210 215 220

aaa ttt aat gtt gag gat gtt gcc aat gca gct ctt tat ctg gct agt 720
Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala Leu Tyr Leu Ala Ser
225 230 235 240

gat gag gca aaa tac gtg agt gga cac aat ctg ttc att gat gga ggg 768
Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu Phe Ile Asp Gly Gly
245 250 255

ttc agc gtc tgc aat tct gta atc aaa gtg ttc caa tat cca gat tct 816
Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe Gln Tyr Pro Asp Ser
260 265 270 

<210> 23 
<211> 272 
<212> PRT 

<213> Forsythia x intermedia 
<400> 23 

Gln Leu Arg Thr Ala Phe Ala Arg Arg Leu Glu Gly Lys Val Ala Leu
1 5 10 15

Ile Thr Gly Gly Ala Ser Gly Ile Gly Glu Thr Thr Ala Lys Leu Phe
20 25 30

Ser Gln His Gly Ala Lys Val Ala Ile Ala Asp Val Gln Asp Glu Leu
35 40 45

Gly His Ser Val Val Glu Ala Ile Gly Thr Ser Asn Ser Thr Tyr Ile
50 55 60

His Cys Asp Val Thr Asn Glu Asp Gly Val Lys Asn Ala Val Asp Asn
65 70 75 80

Thr Val Ser Thr Tyr Gly Lys Leu Asp Ile Met Phe Ser Asn Ala Gly
85 90 95

Ile Ser Asp Pro Asn Arg Pro Arg Ile Ile Asp Asn Glu Lys Ala Asp
100 105 110

Phe Glu Arg Val Leu Ser Val Asn Val Thr Gly Val Phe Leu Cys Met
115 120 125

Lys His Ala Ala Arg Val Met Ile Pro Ala Arg Ser Gly Asn Ile Ile
130 135 140

Ser Thr Ala Ser Leu Ser Ser Thr Met Gly Gly Gly Ser Ser His Ala
145 150 155 160

Tyr Cys Gly Ser Lys His Ala Val Leu Ala Leu Thr Arg Asn Leu Ala
165 170 175

Val Glu Leu Gly Gln Phe Gly Ile Arg Val Asn Cys Leu Ser Pro Phe
180 185 190

Gly Leu Pro Thr Ala Leu Gly Lys Lys Phe Ser Gly Ile Lys Asn Glu
195 200 205

Glu Glu Phe Glu Asn Val Ile Asn Phe Ala Gly Asn Leu Lys Gly Pro
210 215 220

Lys Phe Asn Val Glu Asp Val Ala Asn Ala Ala Leu Tyr Leu Ala Ser
225 230 235 240

Asp Glu Ala Lys Tyr Val Ser Gly His Asn Leu Phe Ile Asp Gly Gly
245 250 255

Phe Ser Val Cys Asn Ser Val Ile Lys Val Phe Gln Tyr Pro Asp Ser
260 265 270 



<210> 24 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature 
<222> (1)..(33)
<223> PCR primer 

<400> 24 

acatatgcag cttcgaactg cattcgcaag aag 33 



<210> 25 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
oligonucleotide 

<220> 

<221> misc_feature
<222> (1)..(33)
<223> PCR primer

<400> 25

catatgggca gacatgttac atgatcaatt gca



